Reduce multi-index/multi-level dataframe to single index, single level

  • A+
Category:Languages

Say I have a dataframe that looks like this:

>>> df    Year   MPG VehicleType FuelType 0  2000  20.5         Car      Gas 1  2009  22.3         Car      Gas 2  2017  50.9         Car      Gas 3  2000  14.7         Car   Diesel 4  2009  18.0         Car   Diesel 5  2017  22.2         Car   Diesel 

I need to split the VehicleType column into two columns based on the value of the FuelType column, using the Year column as an index. I used pivot_table to split the columns correctly.

>>> pd.pivot_table(df, columns=['VehicleType', 'FuelType'], values='MPG', index=['Year']) VehicleType    Car       FuelType    Diesel   Gas Year                     2000          14.7  20.5 2009          18.0  22.3 2017          22.2  50.9 

This is great and all, but it results in a multi-indexed dataframe, which for my purposes I don't want.

I am trying to get a result that looks something like this:

Year Car_Diesel_MPG Car_Gas_MPG 2000           14.7        20.5 2009           18.0        22.3 2017           22.2        50.9 

My efforts at trying to achieve this have resulted in some pretty nasty looking code. Is there a simple way to do this?

 


Using set_axis, map, and join

df2.set_axis(df2.columns.map('_'.join), axis=1, inplace=False).add_suffix('_MPG')        Car_Diesel_MPG  Car_Gas_MPG Year                              2000            14.7         20.5 2009            18.0         22.3 2017            22.2         50.9 

groupby with a dict

m = {t: '_'.join(t) for t in df2.columns} df2.groupby(m, axis=1).mean().add_suffix('_MPG')        Car_Diesel_MPG  Car_Gas_MPG Year                              2000            14.7         20.5 2009            18.0         22.3 2017            22.2         50.9 

Either of these can reset_index

m = {t: '_'.join(t) for t in df2.columns} df2.groupby(m, axis=1).mean().add_suffix('_MPG').reset_index()     Year  Car_Diesel_MPG  Car_Gas_MPG 0  2000            14.7         20.5 1  2009            18.0         22.3 2  2017            22.2         50.9 

groupby instead of pivot_table

df.groupby(     ['Year', df.VehicleType.str.cat(df.FuelType, sep='_').add('_MPG').values] ).MPG.sum().unstack().reset_index()     Year  Car_Diesel_MPG  Car_Gas_MPG 0  2000            14.7         20.5 1  2009            18.0         22.3 2  2017            22.2         50.9 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: