Reshaping Data from Long to Wide in Pandas: A Comprehensive Guide
Many datasets are initially stored in long format, where each row holds a single observation for a subject, so the same subject appears on several rows. However, it often becomes necessary to reshape the data into wide format, where each subject occupies one row and its observations are spread across columns.
Issue: Transforming data from long to wide format can be cumbersome in Pandas, especially with the melt/stack/unstack methods. For instance, consider the following long-format dataframe, in which the salesman Knut appears on three rows, one per product:
import pandas as pd
data = pd.DataFrame({
    'Salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'Height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2]
})
Reshaping to Wide Format:
To reshape the data into wide format, we can use Chris Albon's solution, which relies on DataFrame.pivot. It is demonstrated here on a separate patient dataset:
Create Long Dataframe:
raw_data = {
    'patient': [1, 1, 1, 2, 2],
    'obs': [1, 2, 3, 1, 2],
    'treatment': [0, 1, 0, 1, 0],
    'score': [6252, 24243, 2345, 2342, 23525]
}
df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])
Reshape to Wide:
df.pivot(index='patient', columns='obs', values='score')
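This generates the desired wide-format dataframe, with one row per patient and one column per obs value. Patient 2 has no third observation, so that cell is NaN and the scores are shown as floats: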
obs            1        2       3
patient
1         6252.0  24243.0  2345.0
2         2342.0  23525.0     NaN
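The same pivot approach can be carried over to the salesman dataframe from the start of the article, which is never actually reshaped above. The following is a minimal sketch rather than a definitive recipe. It assumes a helper column named obs (not in the original data) that numbers each salesman's rows, and it assumes a pandas version in which pivot accepts a list for index. The column names product_1, price_1 and so on are an illustrative naming choice.

```python
import pandas as pd

data = pd.DataFrame({
    'Salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'Height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2],
})

# Assumption: number each salesman's rows 1, 2, 3, ... in a helper column 'obs',
# playing the same role as 'obs' in the patient example.
data['obs'] = data.groupby('Salesman').cumcount() + 1

# Pivot both value columns at once; the result has two-level columns
# such as ('product', 1) and ('price', 1).
wide = data.pivot(index=['Salesman', 'Height'], columns='obs',
                  values=['product', 'price'])

# Flatten the two-level column labels into single names like 'product_1'.
wide.columns = [f'{name}_{num}' for name, num in wide.columns]

# Turn Salesman and Height back into ordinary columns.
wide = wide.reset_index()
print(wide)
```

The result has one row per salesman. Steve sold only one product, so his product_2, product_3, price_2 and price_3 cells are missing values, just as patient 2's third observation is NaN above.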