Python 3 pandas.groupby.filter


I am trying to perform a groupby filter that is very similar to the example in this documentation: pandas groupby filter

>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : [1, 2, 3, 4, 5, 6],
...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')
>>> grouped.filter(lambda x: x['B'].mean() > 3.)
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0

I want to return a DataFrame that has all 3 columns but only 2 rows: for each group in column A, the row that holds that group's minimum value of column B. I tried the following line of code:

grouped.filter(lambda x: x['B'] == x['B'].min()) 

But this doesn't work. It raises the following error: TypeError: filter function returned a Series, but expected a scalar bool

The DataFrame I am trying to return should look like this:

     A  B    C
0  foo  1  2.0
1  bar  2  5.0

I would appreciate any help you can provide. Thank you, in advance, for your help.

 


>>> df.loc[df.groupby('A')['B'].idxmin()]
     A  B    C
1  bar  2  5.0
0  foo  1  2.0
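
The error in the question arises because filter keeps or drops whole groups, so the function passed to it has to return a single True or False per group rather than a row-by-row Series. As a rough sketch of two row-level alternatives on the same df (the variable names mask, result_mask, and result_idxmin are only illustrative, not from the original post), one option builds a boolean mask with transform, and the other reuses idxmin and then sorts the index so the rows come back in their original order:

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5., 8., 1., 2., 9.]})

# transform('min') broadcasts each group's minimum of B back onto every row,
# so the comparison yields a boolean mask aligned with df.
mask = df['B'] == df.groupby('A')['B'].transform('min')
result_mask = df[mask]  # keeps every row tied for its group's minimum

# idxmin() gives one index label per group (the first occurrence on ties);
# sort_index() restores the original row order (foo at 0, bar at 1).
result_idxmin = df.loc[df.groupby('A')['B'].idxmin()].sort_index()

print(result_mask)
print(result_idxmin)
```

The two approaches agree here because each group has a unique minimum. If several rows in a group share the minimum, the mask version returns all of them, while the idxmin version returns only the first.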
