Python: Pivot Tables with Pandas

Pandas provides Python with lots of advanced data management tools. Being able to create pivots tables is one of the cooler tools.

You can download the data set I will be working with here: KMeans1

First, let’s upload our data into Python

import pandas as pd
df = pd.read_excel("C:\Users\Benjamin\Documents\KMeans1.xlsx")
df.head()

pandasPivotTable.jpg

Let’s create our first pivot table.

The syntax is = pd.pivot_table(dataframe, index = Columns you want to group by, values = Columns you want to aggregate, aggfunc = type of aggregation)

pd.pivot_table(df, index='Department', values = 'AVG Labor', aggfunc = 'sum')

pandasPivotTable1.jpg

We can group by more than one column

pd.pivot_table(df, index=['Department','Model'], values = 'AVG Labor',  
aggfunc= 'mean')

pandasPivotTable2.jpg

You can also have multiple value columns

 pd.pivot_table(df, index='Department', values = ['AVG Labor','Labor Cost'],
  aggfunc= 'sum')

pandasPivotTable3.jpg

Now, one catch is, what if you want to have different aggregate functions. What if I want to get the mean of AVG Labor, but I want the sum of Labor Cost columns.

In this case we are going to use the groupby().aggregate()

import numpy as np
df.groupby('Department').aggregate({'AVG Labor':np.mean, 'Labor Cost': np.sum})

pandasPivotTable4.jpg

Please Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s