How to Pivot a Dataframe Using Pandas?

Published on 2024-11-14

Reshaping tabular data is an essential task in data analysis. Pivoting turns the unique values of one column into column headers, converting a long table into a wide one. It is useful for building pivot tables and for viewing the same data from different perspectives. This article shows how to perform the operation in Pandas, a widely used data manipulation library.

To pivot a dataframe, use the .pivot method. It takes three main arguments:

  1. index: Specifies the column(s) to become the index of the pivoted dataframe.
  2. columns: Indicates the column(s) to become the column headers of the pivoted dataframe.
  3. values: Denotes the column(s) whose values should be used to populate the pivot table.

For example, consider the following dataframe:

Indicator  Country  Year  Value
1          Angola   2005  6
2          Angola   2005  13
3          Angola   2005  10
4          Angola   2005  11
5          Angola   2005  5
1          Angola   2006  3
2          Angola   2006  2
3          Angola   2006  7
4          Angola   2006  3
5          Angola   2006  6

To pivot this dataframe so that the values in the Indicator column become the new columns, use the following code:

out = df.pivot(index=['Country', 'Year'], columns='Indicator', values='Value')
print(out)

This operation will produce the following pivoted dataframe:

Indicator     1   2   3   4  5
Country Year
Angola  2005  6  13  10  11  5
        2006  3   2   7   3  6
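The snippet above assumes that df already exists. As a minimal self-contained sketch, the example table can be rebuilt by hand and pivoted the same way. The dataframe construction is an assumption, since the article does not show how the data was loaded.

```python
import pandas as pd

# Rebuild the example table as a long-format dataframe
# (hand-typed here; the original loading step is not shown in the article).
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5] * 2,
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

# Each (Country, Year) pair becomes one row; each Indicator value becomes a column.
# Passing a list to index requires pandas 1.1 or later.
out = df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
print(out)
```

Because every (Country, Year, Indicator) combination appears exactly once here, .pivot can place each Value in a unique cell without aggregating.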

To convert the pivoted dataframe back to a flat table, use .rename_axis to remove the Indicator name from the column axis and .reset_index to turn Country and Year back into ordinary columns.

print(out.rename_axis(columns=None).reset_index())

This produces a flat table with a default integer index. It is still in wide form, with one column per indicator:

  Country  Year  1   2   3   4  5
0  Angola  2005  6  13  10  11  5
1  Angola  2006  3   2   7   3  6
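The flattened table keeps one column per indicator. If the long layout with a single Value column is needed again, DataFrame.melt can reverse the pivot. The following is a sketch that goes beyond the original article; the names flat and long_again are illustrative.

```python
import pandas as pd

# Same example data as in the article, typed in by hand.
df = pd.DataFrame({
    "Indicator": [1, 2, 3, 4, 5] * 2,
    "Country": ["Angola"] * 10,
    "Year": [2005] * 5 + [2006] * 5,
    "Value": [6, 13, 10, 11, 5, 3, 2, 7, 3, 6],
})

out = df.pivot(index=["Country", "Year"], columns="Indicator", values="Value")

# Flatten: drop the "Indicator" axis name and move the index levels into columns.
flat = out.rename_axis(columns=None).reset_index()

# Melt the indicator columns back into (Indicator, Value) pairs.
long_again = flat.melt(
    id_vars=["Country", "Year"], var_name="Indicator", value_name="Value"
)
# The melted labels come from a mixed-type column index, so cast them back to int.
long_again["Indicator"] = long_again["Indicator"].astype(int)
long_again = long_again.sort_values(["Country", "Year", "Indicator"]).reset_index(drop=True)
print(long_again)
```

After sorting, long_again holds the same ten rows as df, with the columns in a different order.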

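If your data contains duplicate combinations of labels (e.g., the same Country, Year, and Indicator appearing more than once), .pivot raises a ValueError because it cannot decide which value belongs in the cell. Use .pivot_table instead: it aggregates the duplicates, taking the mean by default.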

out = df.pivot_table(
    index=['Country', 'Year'],
    columns='Indicator',
    values='Value')
print(out.rename_axis(columns=None).reset_index())

This produces the same wide layout, but each cell holds the mean of all values that share a label combination. To aggregate differently, pass the aggfunc argument (for example, aggfunc='sum').
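To see the difference between the two methods, here is a small sketch using made-up data in which the combination (Angola, 2005, Indicator 1) appears twice. The dataframe dup and its values are hypothetical and are not taken from the article's table.

```python
import pandas as pd

# Hypothetical data: Indicator 1 is reported twice for Angola in 2005.
dup = pd.DataFrame({
    "Indicator": [1, 1, 2],
    "Country": ["Angola"] * 3,
    "Year": [2005] * 3,
    "Value": [6, 8, 13],
})

# .pivot cannot reshape when a label combination is duplicated.
pivot_failed = False
try:
    dup.pivot(index=["Country", "Year"], columns="Indicator", values="Value")
except ValueError as err:
    pivot_failed = True
    print("pivot failed:", err)

# .pivot_table aggregates the duplicates; the default aggregation is the mean.
mean_out = dup.pivot_table(index=["Country", "Year"], columns="Indicator", values="Value")

# aggfunc selects a different aggregation, here the sum.
sum_out = dup.pivot_table(
    index=["Country", "Year"], columns="Indicator", values="Value", aggfunc="sum"
)

print(mean_out)
print(sum_out)
```

In this sketch the duplicated cell becomes 7.0 under the default mean and 14 under aggfunc="sum", while the unduplicated Indicator 2 keeps its value of 13.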

For a more detailed overview, refer to the Pandas user guide on Reshaping and pivot tables.
