In data analysis, it is often useful to bin numeric data into categories to simplify its representation and analysis. Binning is especially common for continuous values such as percentages.
Suppose we have a data frame column named "percentage" containing numeric values, as shown below:
df['percentage'].head()
46.5
44.2
100.0
42.12
To bin this column and count the values in each bin, we can use pd.cut or, alternatively, np.searchsorted. Here are two ways to achieve this:
Using pd.cut with groupby:
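import pandas as pd
bins = [0, 1, 5, 10, 25, 50, 100]
# pd.cut assigns each value to a right-closed interval such as (25, 50]
df['binned'] = pd.cut(df['percentage'], bins)
# observed=False keeps empty bins in the result with a count of 0
print(df.groupby('binned', observed=False).size())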
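Since the goal is a count per bin, the same result can also be obtained by calling value_counts directly on the binned column. The following is a minimal, self-contained sketch; the DataFrame is built from the four sample values shown earlier, which is an assumption about the data rather than the full original column.

```python
import pandas as pd

# Sample data: the four values shown in the head() output (assumed, for illustration).
df = pd.DataFrame({"percentage": [46.5, 44.2, 100.0, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

# pd.cut assigns each value to a right-closed interval such as (25, 50].
binned = pd.cut(df["percentage"], bins)

# sort=False keeps the bins in interval order instead of ordering by count.
# Empty bins are still listed, because the result is categorical.
counts = binned.value_counts(sort=False)
print(counts)
```

Here value_counts avoids the extra column and the groupby step, which may be preferable when the binned labels are not needed afterwards.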
Using np.searchsorted and groupby:
df['binned'] = np.searchsorted(bins, df['percentage'].values)
print(df.groupby(df['binned']).size())
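Note that np.searchsorted returns integer positions into bins, not interval labels. The sketch below shows one possible way to translate those positions back into readable labels; the labels dictionary is a hypothetical helper introduced here for illustration, and the sample values are again the four shown earlier.

```python
import numpy as np
import pandas as pd

# Sample data: the four values shown in the head() output (assumed, for illustration).
df = pd.DataFrame({"percentage": [46.5, 44.2, 100.0, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

# With the default side="left", searchsorted returns i such that
# bins[i-1] < x <= bins[i], i.e. the right-closed interval (bins[i-1], bins[i]].
idx = np.searchsorted(bins, df["percentage"].to_numpy())

# Hypothetical label lookup built from the bin edges. Index 0 (x <= 0) and
# index len(bins) (x > 100) lie outside every bin and get no label.
labels = {i: f"({bins[i - 1]}, {bins[i]}]" for i in range(1, len(bins))}

# Map positions to labels, then count; bins with no values do not appear.
counts = pd.Series(idx).map(labels).value_counts()
print(counts)
```

Unlike the pd.cut version, this result contains only the bins that actually hold values.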
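The pd.cut method returns the output below. The np.searchsorted method yields the same counts, but it labels each group by its integer bin position (5 and 6 here) rather than by interval, and it omits empty bins: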
binned
(0, 1]       0
(1, 5]       0
(5, 10]      0
(10, 25]     0
(25, 50]     3
(50, 100]    1
dtype: int64
This output indicates that there are no values in the bins (0, 1], (1, 5], (5, 10], and (10, 25]. Three values fall in the bin (25, 50], and one value falls in the bin (50, 100].
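The interval notation matters at the edges: (0, 1] excludes 0 and includes 1. For percentage data, where exact values of 0 or 100 are plausible, the following sketch shows how the include_lowest and right parameters of pd.cut change which edge values are assigned to a bin. The three edge values used here are hypothetical and are not taken from the original data.

```python
import pandas as pd

bins = [0, 1, 5, 10, 25, 50, 100]
values = pd.Series([0.0, 50.0, 100.0])  # hypothetical edge-case values

# Default: right-closed intervals, so 0.0 lies outside (0, 1] and becomes NaN.
default = pd.cut(values, bins)

# include_lowest=True makes the first interval include its left edge as well.
with_lowest = pd.cut(values, bins, include_lowest=True)

# right=False switches to left-closed intervals [a, b), so 100.0 is now excluded.
left_closed = pd.cut(values, bins, right=False)

# True marks a value that did not land in any bin.
print(default.isna().tolist())
print(with_lowest.isna().tolist())
print(left_closed.isna().tolist())
```

Values that fall outside every bin become NaN and are silently dropped from the counts, so it is worth checking the edge behavior before trusting the totals.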