Simple Note for Python Beginners: Accessing Elements of a Pandas DataFrame

For developers experienced in languages such as Java or C#, it takes some time to discover the benefits of Python libraries. In my case, I was used to writing complex iterative loops spanning many lines of code, so I was really surprised and impressed by the functionality of the Pandas DataFrame. Here is a snippet from my recent code:

df_grouped = df[['user_id', 'r1']].groupby(['user_id', 'r1']).agg({'r1':["count"]}).reset_index().groupby('user_id')

With a single line of code, I selected the relevant columns, grouped the data by two columns, counted the values of the second column within each group, and finally grouped the result again by the first column. You can extend this line by chaining further operations. It is not just concise: because Pandas runs these operations in optimized, vectorized code, it will usually also run faster than an equivalent loop written by hand in Python.
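To make the steps easier to follow, here is a minimal sketch that splits the chain into separate statements. The sample data is made up for illustration; only the column names user_id and r1 come from my snippet. I also assume that size() with reset_index(name='count') is an acceptable stand-in for the agg call, since it yields the same counts with flat column names.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the snippet above.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "r1": ["a", "a", "b", "a", "c"],
})

# Step 1: keep only the relevant columns.
subset = df[["user_id", "r1"]]

# Steps 2 and 3: group by both columns and count the rows in each
# (user_id, r1) pair. size() is an assumed substitute for
# agg({'r1': ['count']}); it gives one flat 'count' column.
counts = subset.groupby(["user_id", "r1"]).size().reset_index(name="count")

# Step 4: group the counts again, this time per user.
df_grouped = counts.groupby("user_id")

# Each group holds the r1 counts of a single user.
for user_id, group in df_grouped:
    print(user_id)
    print(group)
```

Splitting the chain like this is handy while debugging, because you can inspect every intermediate DataFrame before putting the line back together.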

The DataFrame also comes with very useful indexers, such as iloc and loc. Both give you access to rows through the index; the basic difference is that iloc takes an integer position as its argument, while loc takes an index label. For additional information, the following article has good examples of index-level operations on Pandas DataFrames:

Using iloc, loc, & ix to select rows and columns in Pandas DataFrames
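The difference between position-based and label-based access can be sketched with a tiny example. The DataFrame, its index labels, and the score column are all invented for illustration.

```python
import pandas as pd

# Hypothetical DataFrame with string labels as its index.
df = pd.DataFrame(
    {"score": [10, 20, 30]},
    index=["alice", "bob", "carol"],
)

# iloc selects by integer position: 0 is the first row.
first_by_position = df.iloc[0]

# loc selects by index label.
bob_by_label = df.loc["bob"]

# loc also accepts a column label to reach a single cell.
carol_score = df.loc["carol", "score"]

# Slices differ too: the end of an iloc slice is exclusive,
# while the end of a loc slice is inclusive.
first_two_by_position = df.iloc[0:2]
alice_to_bob = df.loc["alice":"bob"]

print(first_by_position)
print(bob_by_label)
print(carol_score)
```

Note that both slices above return the same two rows, even though one is written with positions and the other with labels.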

