Creating a Column Based on Conditional Logic in Python
When working with Pandas DataFrames, we often need to create a new column whose value depends on a comparison between existing columns. Two common ways to do this are a custom function applied row by row and nested calls to NumPy's np.where function.
To illustrate, consider the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [2, 3, 1],
    "B": [2, 1, 3]
})
We want to create a new column C based on the following criteria: C is 0 when A equals B, 1 when A is greater than B, and -1 when A is less than B.
Using a Custom Function
One approach is to create a custom function that implements the conditional logic and apply it to the DataFrame:
def f(row):
    if row['A'] == row['B']:
        return 0
    elif row['A'] > row['B']:
        return 1
    else:
        return -1

df['C'] = df.apply(f, axis=1)
Using np.where
Alternatively, we can use the np.where function to directly assign values to the new column:
df['C'] = np.where(df['A'] == df['B'], 0, np.where(df['A'] > df['B'], 1, -1))
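This approach is vectorized: the comparisons run over whole columns at once instead of calling a Python function for every row, so it is typically much faster on large datasets.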
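Nested np.where calls become harder to read as the number of branches grows. As a sketch of one alternative that the text above does not cover, NumPy's np.select takes a list of conditions and a matching list of values. The example assumes the same df as before; the names conditions and choices are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [2, 3, 1],
    "B": [2, 1, 3],
})

# Conditions are checked in order; the first one that is True wins.
conditions = [
    df["A"] == df["B"],
    df["A"] > df["B"],
]
# choices[i] is the value used where conditions[i] is True.
choices = [0, 1]

# Rows matching no condition (here, A < B) receive the default value.
df["C"] = np.select(conditions, choices, default=-1)
```

For only three outcomes the nested np.where form is just as workable; np.select mainly helps when more branches are added later.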
Result:
Both approaches produce the following result:
print(df)

   A  B  C
0  2  2  0
1  3  1  1
2  1  3 -1
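To check that the two approaches agree, a small self-contained script can compute the column both ways and compare the results. This is a minimal sketch; the variable names c_apply, c_where, and same are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, 3, 1], "B": [2, 1, 3]})

def f(row):
    # Row-wise version of the rule: 0 if equal, 1 if A > B, otherwise -1.
    if row["A"] == row["B"]:
        return 0
    elif row["A"] > row["B"]:
        return 1
    else:
        return -1

# Column computed with the custom function, one row at a time.
c_apply = df.apply(f, axis=1)

# Column computed with nested np.where, wrapped in a Series with the same index.
c_where = pd.Series(
    np.where(df["A"] == df["B"], 0, np.where(df["A"] > df["B"], 1, -1)),
    index=df.index,
)

# True only if every row matches between the two versions.
same = bool((c_apply == c_where).all())
print(same)  # True
```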