How to Pivot a DataFrame or Table in Python Pandas?

Discover how to pivot a dataframe or table in Python Pandas

Facing issues while trying to pivot a DataFrame or a table in Python Pandas? We’ve got the solution for you!

Manipulating the data according to your requirements is an art at which only a few are adept. Interchanging rows to columns or columns to rows might sound easy on paper. However, it can feel confusing once you get to do it yourself.

You try to give up on the idea of using DataFrames and default back to standard tables — yet the problem persists 😠. 

Considering data manipulation is an intricate art, you’d need to be pretty adept at Pandas to make it work.

How to pivot a dataframe or table in Python Pandas

 

According to Pandas, most data formats can automatically be reshaped using the pivot function. However, knowing what arguments and parameters to pass is also essential since there’s no telling how the data will look otherwise 🤔.

Hence, it’s recommended you master the generic pivot function from Pandas instead of implementing a custom copy-paste loop. This function generally works in the following way:

  • Taking the given index and column values to identify the point of pivot. 
  • Interchanging the axes between the row-column format to produce a pivot table. 

Having said that, this article will primarily focus on pivoting complete DataFrames and Tables with the help of the Pandas library in Python.

 

Note: This guide will list the pivoting technique using the built-in functionality of the Pandas library. Therefore, there’s no need to look for manual pivoting methods. Besides, the flexibility of the Python programming language allows you to implement pivots in any direction. 

 

 

How to Pivot a DataFrame in Pandas Python?

1. Understanding the Format

A DataFrame in Python can be pivoted by using the following function:

DataFrame.pivot(index=None, columns=None, values=None)

 

The DataFrame object’s pivot function takes in a total of three arguments that control the following operations 🤓:

  • index: This parameter takes in the index to form axes of the resulting DataFrame. 
  • columns: This parameter serves a similar function as the index but takes in the names of the columns instead. 
  • values: These determine how the new indexing is going to turn out. Since this function does not support data aggregation, you will get a MultiIndex in columns — if additional values are passed. 

 

This return-based function serves the pivoted list once it’s called. Therefore, be sure to link the reference with a variable:

Variable_name =  DataFrame.pivot(index=label_1, columns=label_2, values=label_3)

 

 

2. Making It Work

With that said, here’s a working example of how the pivot will turn out or operate on a sample DataFrame:

  1. To start, let’s create a sample DataFrame to play around with,
import pandas as pd

DataFrame = pd.DataFrame(

   {

       'words': ['one', 'one', 'one', 'two', 'two', 'two'],

       'upperalpha': ['A', 'B', 'C', 'A', 'B', 'C'],

       'numbers': [1, 2, 3, 4, 5, 6],

       'loweralpha': ['x', 'y', 'z', 'q', 'w', 't']

   }

)

 

  1. Next, call the pivot function to initiate a Pandas DataFrame pivot with words as an index, upperalpha as columns, and numbers as values, and store it in a variable:
variable = DataFrame.pivot(index='words', columns='upperalpha', values='numbers')

 

  1. Upon execution, you will end up with the following output:
upperalpha  A  B  C

words              

one         1  2  3

two         4  5  6

 

 

How to Pivot a Table in Pandas Python?

1. Understanding the Format

A table in Python can be pivoted by using the following function from the Pandas library:

pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', observed=False, sort=True)

 

The pivot_table function can take in a boatload of arguments 🤯 that control the resultant formation of the pivoted table. Here’s how each argument works:

  1. data: Refers to the DataFrame object being passed since the function is library-specific. 
  2. values: Allows you to aggregate columns, unlike the Pandas DataFrame pivot.
  3. index: Allows you to pass a column, Grouper, array, or a list as long as it’s the same length as the data parameter (strictly in the case of an array). 
  4. columns: Pass a column, Grouper, array, or a list of the previous. However, the array must be the same length as the data parameter.
  5. aggfunc: Pass a function, a list of multiple functions, a dictionary, or a default numpy.mean. If a function list is passed as a parameter, the resulting pivot table will form a hierarchy of columns. In the case of a dictionary, the key will become the column to aggregate, while the value becomes a function or a list of functions. 
  6. fill_value: This can take in a scalar value to replace any missing values in the resultant pivot table. This value is replaced only after the table is aggregated.
  7. margins: Takes in a boolean and adds all rows and columns to calculate a total.
  8. dropna: As the name suggests, it takes in a boolean and drops all the NaN columns from the resultant table.
  9. margins_name: These take in a string to allow you to specify the extent of the pivot by mentioning the name of the row/column containing the totals.
  10. observed: If the groupers are categorical, only the observed values will be shown. This is a boolean parameter as well.
  11. sort: Takes a boolean in to specify whether the resultant pivot values should be sorted or not.

 

This is also a return-based function that dishes out an excel style pivot table once executed, so make sure to reference it to a variable 😮‍💨

variable_name = pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', observed=False, sort=True)

 

 

2. Making It Work

With that said, here’s a working example of the table pivot in Pandas Python:

  1. To start, let’s create a sample Pandas DataFrame, and import the dependencies:
import pandas as pd

import numpy as np

DataFrame = pd.DataFrame(

   {

       'words': ['one', 'one', 'one', 'two', 'two', 'two'],

       'upperalpha': ['A', 'B', 'C', 'A', 'B', 'C'],

       'numbers': [1, 2, 3, 4, 5, 6],

       'loweralpha': ['x', 'y', 'z', 'q', 'w', 't']

   }

)

 

  1. Next, apply the function with your desired parameters:
variable = pd.pivot_table(DataFrame, values='words', index=['upperalpha'],

                   columns=['numbers'], aggfunc=np.sum)

 

  1. Then, you will end up with the following pivoted table upon execution:
numbers       1    2    3    4    5    6

upperalpha                              

A           one  NaN  NaN  two  NaN  NaN

B           NaN  one  NaN  NaN  two  NaN

C           NaN  NaN  one  NaN  NaN  two

 

 

Conclusion

In essence, the Pandas library features built-in functions for both DataFrame pivot and table pivot. However, you have the liberty to be a bit more controlling in the latter. Unfortunately, this also means that the table function can feel slightly more complex than the fundamental pivot.

With that in mind, we’ve kept the guide as basic and down-to-earth as possible, so you won’t have difficulty implementing Pandas DataFrame and table pivots.

Before running the script, re-verify the arguments to ensure one coincides with the other 😌

If the mentioned examples don’t work out for you, update your Pandas library to the latest version. 

Lastly, let us know in the comments:

  • Were you able to pivot your table or DataFrame in Pandas Python?
  • Is there a simpler way to pivot DataFrames?
  • Which kind of your current projects require pivots?
  • Are there any points you believe we should’ve mentioned?

Feel free to share this post with your friends and peers, so they don’t have to struggle with table and DataFrame pivots with Pandas using Python. 

Total
0
Shares
Leave a Reply

Your email address will not be published. Required fields are marked *

Related Posts
Total
0
Share