Pandas: Selecting and modifying dataframe based on even more complex criteria


I was looking at this and this thread, and though my question is similar, it has a few differences. I have a dataframe full of floats that I want to replace with strings. Say:

      A     B       C
A    0     1.5     13
B    0.5   100.2   7.3
C    1.3   34      0.01

I want to replace the values in this table according to several criteria, but only the first replacement works:

df[df<1]='N'  # Works
df[(df>1)&(df<10)]='L'  # Doesn't work
df[(df>10)&(df<50)]='M'  # Doesn't work
df[df>50]='H'  # Doesn't work

If I instead restrict the selection in the 2nd line to the cells that are still floats, it still doesn't work:

((df.applymap(type)==float) & (df<10) & (df>1)) #Doesn't work 

I was wondering how to apply pd.DataFrame().mask here, or whether there is any other way. How should I solve this?

Alternatively, I know I could go column by column and apply the substitutions to each series, but this seems a bit counterproductive.

Edit: Could anyone explain why the 4 simple assignments above do not work?

 


Use numpy.select with the DataFrame constructor:

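import numpy as np
import pandas as pd

m1 = df < 1
m2 = (df > 1) & (df < 10)
m3 = (df > 10) & (df < 50)
m4 = df > 50

vals = list('NLMH')

# default='' covers cells that match no condition and keeps the result a string array
df = pd.DataFrame(np.select([m1, m2, m3, m4], vals, default=''), index=df.index, columns=df.columns)
print(df)

   A  B  C
A  N  L  M
B  N  H  L
C  L  M  N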
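
As for why the four plain assignments fail: most likely, once the first one has written 'N' into the frame, the affected columns hold a mix of strings and floats, so the next comparison (df>1) has to compare a str with a number, which Python 3 refuses. Building all the masks while the data is still numeric avoids that. Below is a minimal sketch of the same idea using DataFrame.mask, since the question asked about it. The sample frame is rebuilt from the table in the question, and the name labels is just an illustrative choice:

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame(
    {"A": [0, 0.5, 1.3], "B": [1.5, 100.2, 34], "C": [13, 7.3, 0.01]},
    index=["A", "B", "C"],
)

# Build every mask first, while the frame is still purely numeric.
m1 = df < 1
m2 = (df > 1) & (df < 10)
m3 = (df > 10) & (df < 50)
m4 = df > 50

# mask() returns a new frame each time, so df itself is never modified
# and the masks above stay valid throughout the loop.
labels = df
for cond, label in [(m1, "N"), (m2, "L"), (m3, "M"), (m4, "H")]:
    labels = labels.mask(cond, label)

print(labels)
```

Note that with strict inequalities a value of exactly 1, 10 or 50 matches none of the masks and would be left as a float; switching one side of each range to <= or >= would close those gaps if that matters for your data.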
