Groupby sum and count on multiple columns under multiple conditions in Python

  • A+
Category:Languages

I have a pandas dataframe that looks like this:

import pandas as pd import numpy as np data = {     "Type": ["A", "A", "B", "B", "B"],     "Project": ["X123", "X123", "X21", "L31", "L31"],     "Number": [100, 300, 100, 200, 500],     "Status": ['Y', 'Y', 'N', 'Y', 'N'] } df = pd.DataFrame.from_dict(data) 

I want to group by Type and get count and sum with several conditions and get results as follows:

Type  Total_Count  Total_Number  Count_Status=Y  Number_Status=Y  Count_Status=N  Number_Status=N   A        2          400              2               400              0               0  B        5          800              1               200              2              600 

I have tried following but not exactly what i need. Please share any ideas that you might have. Thanks!

df1 = pd.pivot_table(df, index = 'Type', values = 'Number', aggfunc = np.sum) df2 = pd.pivot_table(df, index = 'Type', values = 'Project', aggfunc = 'count') pd.concat([df1, df2], axis=1) 

 


If you want to create a Function:

def my_agg(x):     names = {         'Total_Count': x['Type'].count(),         'Total_Number': x['Number'].sum(),         'Count_Status=Y': x[x['Status']=='Y']['Type'].count(),         'Number_Status=Y': x[x['Status']=='Y']['Number'].sum(),         'Count_Status=N': x[x['Status']=='N']['Type'].count(),         'Number_Status=N': x[x['Status']=='N']['Number'].sum()}      return pd.Series(names)  df.groupby('Type').apply(my_agg)      Total_Count   Total_Number  Count_Status=Y  Number_Status=Y Count_Status=N  Number_Status=N Type                         A      2           400                2                400            0             0 B      3           800                1                200            2            600 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: