Pandas cheat sheet

Along with NumPy, Pandas is one of the most useful tools for data analysis in Python. Its main focus is handling time series and performing statistical analysis, and it provides several classes and methods for that purpose. In this post I keep adding short pieces of code for the Pandas functions I use most often.

In all the following examples I import pandas as pd:
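A minimal sketch of that import. The NumPy import is an addition here, not something the post states, on the assumption that the two libraries are used together.

```python
import numpy as np   # assumption: not in the original post, but commonly paired with pandas
import pandas as pd

# Quick check that both libraries are available
print(pd.__version__, np.__version__)
```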

Date Range

Creates a Pandas DatetimeIndex with numpy datetime64 values:

or
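The two variants were probably along these lines: pd.date_range accepts either a start and an end, or a start and a number of periods. The dates below are placeholders chosen for illustration.

```python
import pandas as pd

# Variant 1: explicit start and end (both endpoints are included by default)
by_end = pd.date_range(start="2020-01-01", end="2020-01-05", freq="D")

# Variant 2: start plus a number of periods
by_periods = pd.date_range(start="2020-01-01", periods=5, freq="D")

print(by_end)
print(by_end.equals(by_periods))
```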

Commonly used frequency aliases:

  • T,min = minutes
  • S = seconds
  • H = hours
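A short sketch of how these aliases combine with a multiplier in pd.date_range. It uses the lowercase spellings min, s and h, because pandas 2.2 and later deprecate the uppercase T, S and H. The start time is a placeholder.

```python
import pandas as pd

start = "2020-01-01 00:00:00"  # hypothetical start time

every_15_min = pd.date_range(start, periods=4, freq="15min")
every_30_s = pd.date_range(start, periods=4, freq="30s")   # "30S" in older code
every_2_h = pd.date_range(start, periods=4, freq="2h")     # "2H" in older code

print(every_15_min[-1])
print(every_30_s[-1])
print(every_2_h[-1])
```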

DataFrame

It works much like a spreadsheet (i.e. a column/row structure, filtering, operations on each cell, etc.).

A DataFrame filled with NaNs:
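One way to do this, as a sketch: pass np.nan as the scalar data together with the desired index and columns. The index and column names are made up for the example.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-01", periods=3, freq="D")    # hypothetical index
df = pd.DataFrame(np.nan, index=index, columns=["a", "b"])  # hypothetical column names

print(df)
```

Passing np.nan explicitly gives float columns; omitting the data argument also yields NaNs, but with object dtype.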

Group generator by time
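The original snippet is not shown, so this is only a guess at its shape: grouping a time-indexed DataFrame with pd.Grouper and iterating over the result, which yields (label, sub-DataFrame) pairs. The data and the daily frequency are assumptions.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-01", periods=6, freq="12h")  # hypothetical data
df = pd.DataFrame({"value": np.arange(6.0)}, index=index)

# groupby is lazy; iterating yields one (group label, sub-DataFrame) pair per day
groups = df.groupby(pd.Grouper(freq="D"))
for day, chunk in groups:
    print(day.date(), len(chunk))
```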

Group operations
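A sketch of typical operations on time groups, assuming a daily pd.Grouper and a made-up value column: one aggregate for every column, or several aggregates for one column.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-01", periods=6, freq="12h")  # hypothetical data
df = pd.DataFrame({"value": np.arange(6.0)}, index=index)

# One aggregate applied to every column
daily_mean = df.groupby(pd.Grouper(freq="D")).mean()

# Several aggregates applied to a single column
daily_stats = df.groupby(pd.Grouper(freq="D"))["value"].agg(["min", "max", "sum"])

print(daily_mean)
print(daily_stats)
```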

Slicing

For DataFrame:
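A sketch of the three usual approaches on a small made-up DataFrame: label-based with .loc, position-based with .iloc, and boolean masks.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]},
    index=["w", "x", "y", "z"],
)

by_label = df.loc["x":"y", ["a"]]   # label-based; the end label is included
by_position = df.iloc[1:3, 0:1]     # position-based; the end position is excluded
by_condition = df[df["b"] > 20]     # boolean mask selecting rows

print(by_label)
print(by_condition)
```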

For a Series (a single DataFrame column):
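The same accessors apply to a Series, with only one axis to index. A sketch using a made-up column:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=["w", "x", "y", "z"])
s = df["a"]                      # selecting one column returns a Series

by_label = s.loc["x":"y"]        # end label included
by_position = s.iloc[:2]         # end position excluded
by_condition = s[s > 2]          # boolean mask

print(by_label.tolist(), by_position.tolist(), by_condition.tolist())
```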

If index is a datetime value:
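With a DatetimeIndex, .loc accepts date strings, including partial ones, and Timestamp objects. A sketch on made-up hourly data:

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-01", periods=72, freq="h")  # three days, hourly
df = pd.DataFrame({"value": np.arange(72)}, index=index)

one_day = df.loc["2020-01-02"]                           # partial string: the whole day
window = df.loc["2020-01-01 06:00":"2020-01-01 09:00"]   # both ends included
tail = df.loc[pd.Timestamp("2020-01-03"):]               # from a Timestamp onwards

print(len(one_day), len(window), len(tail))
```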

If the index of df is sparse, it can help to slice a second DataFrame using just the df index:
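A sketch of the idea, with df2 as a hypothetical name for the second DataFrame: passing df.index to .loc keeps only the rows of df2 whose labels appear in df.

```python
import numpy as np
import pandas as pd

full_index = pd.date_range("2020-01-01", periods=10, freq="D")
df2 = pd.DataFrame({"value": np.arange(10)}, index=full_index)  # hypothetical second frame

# df only has a few of those dates (a sparse index)
df = pd.DataFrame({"flag": [1, 1, 1]}, index=full_index[[0, 3, 7]])

subset = df2.loc[df.index]
print(subset)
```

This requires every label in df.index to exist in df2; otherwise .loc raises a KeyError, and df2.reindex(df.index) is the more forgiving choice.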

Reorder columns
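A sketch with made-up column names: select the columns in the order you want, or use reindex.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

reordered = df[["c", "a", "b"]]              # select columns in the new order
same = df.reindex(columns=["c", "a", "b"])   # equivalent when all columns exist

print(list(reordered.columns))
```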

Return numpy array
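A sketch of the two common forms: the to_numpy() method and the older .values attribute. The data is made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

arr = df.to_numpy()        # 2-D array, one row per DataFrame row
col = df["a"].to_numpy()   # 1-D array for a single column
older = df.values          # attribute form often seen in older code

print(arr.shape, col.shape)
```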

Add columns
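A sketch of three ways to add columns, using made-up names: assignment from an expression, assignment of a scalar, and assign, which returns a new DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

df["b"] = df["a"] * 2                 # new column computed from an existing one
df["c"] = 0                           # a scalar is broadcast to every row
df = df.assign(d=df["a"] + df["b"])   # returns a new DataFrame with column d

print(df)
```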

Using df to create boolean columns
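A sketch with a made-up column: a comparison on a column returns a boolean Series, which can be stored as a new column and combined with & or |.

```python
import pandas as pd

df = pd.DataFrame({"a": [-1, 0, 2, 5]})

df["positive"] = df["a"] > 0
df["in_range"] = df["a"].between(0, 2)          # inclusive on both ends
df["both"] = df["positive"] & df["in_range"]    # element-wise AND

print(df)
```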

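This also works with columns from other DataFrames, as long as they share the same index as df; if only the lengths match, compare the underlying arrays instead, because assignment aligns on index labels.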
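A sketch of both situations, with hypothetical frames named other and unlabeled: a second DataFrame that shares the index of df, and one that only matches in length.

```python
import pandas as pd

index = pd.date_range("2020-01-01", periods=3, freq="D")
df = pd.DataFrame({"a": [1, 2, 3]}, index=index)
other = pd.DataFrame({"x": [5, -5, 5]}, index=index)   # same index as df

df["x_positive"] = other["x"] > 0        # aligned on the shared index

# Only the length matches here, so use the raw array to skip index alignment
unlabeled = pd.DataFrame({"y": [0, 9, 0]})             # default 0..2 index
df["y_zero"] = unlabeled["y"].to_numpy() == 0

print(df)
```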
