Percentage of Total with Groupby in Pandas
This article focuses on a common task in data analysis: calculating the percentage of a total for each group within a DataFrame. Here's the question and a solution using Pandas.
Question:
I have a CSV file with columns State, Office ID, and Sales. I want to calculate the percentage of sales per office in a given state, where the total of all percentages in each state is 100%.
Solution:
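To achieve this, group the data twice: first by state and office to get each office's total sales, then by state alone to get the state total that each office's figure is divided by.
First, sum the sales for each state and office. The code assumes the CSV has already been loaded into a DataFrame named df whose columns are named 'state', 'office_id', and 'sales':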
import pandas as pd

state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
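The snippet above assumes that df already exists. As a minimal sketch, the following builds a small DataFrame by hand (the values are invented for illustration and do not come from the original CSV) and runs the same aggregation on it:

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV; the values are made up.
df = pd.DataFrame({
    "state": ["CA", "CA", "CA", "WA", "WA"],
    "office_id": [1, 1, 2, 1, 2],
    "sales": [100, 300, 600, 250, 750],
})

# One row per (state, office_id) pair, with sales summed within each pair.
state_office = df.groupby(["state", "office_id"]).agg({"sales": "sum"})
print(state_office)
```

The result is indexed by a (state, office_id) MultiIndex, which is what allows the next step to group on the state level alone.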
Then, we create a new DataFrame by dividing each sales value by its corresponding sum by state:
# Level 0 of the index is 'state'; transform('sum') repeats each state's total on every office row.
state_pcts = 100 * state_office / state_office.groupby(level=0).transform('sum')
Because each office's sales are divided by the total for its own state, the percentages within every state add up to 100%.
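Putting both steps together, here is a self-contained sketch with a check that each state's percentages sum to 100. The helper name pct_of_state_total and the sample values are hypothetical; the sketch uses transform('sum') to broadcast each state's total back onto the office rows, which is one way to perform the division described above.

```python
import pandas as pd


def pct_of_state_total(df):
    """Return each office's share of its state's sales, in percent."""
    # Step 1: total sales per (state, office_id) pair.
    state_office = df.groupby(["state", "office_id"]).agg({"sales": "sum"})
    # Step 2: state totals, repeated on every office row of that state.
    state_totals = state_office.groupby(level="state").transform("sum")
    return 100 * state_office / state_totals


# Hypothetical sample data; the values are made up for illustration.
df = pd.DataFrame({
    "state": ["CA", "CA", "CA", "WA", "WA"],
    "office_id": [1, 1, 2, 1, 2],
    "sales": [100, 300, 600, 250, 750],
})

state_pcts = pct_of_state_total(df)
print(state_pcts)  # CA offices: 40.0 and 60.0; WA offices: 25.0 and 75.0

# Sanity check: the percentages within each state should sum to 100.
per_state_sum = state_pcts.groupby(level="state")["sales"].sum()
print(per_state_sum)
```

Calling state_pcts.reset_index() afterwards turns the index levels back into ordinary columns if a flat table is easier to work with.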