pandas filling nans by mean of before and after non-nan values


I would like to fill the NaN values in a DataFrame with the average of the nearest non-NaN values before and after them.

Consider a dataframe:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'val': [1, np.nan, 4, 5, np.nan, 10, 1, 2, 5, np.nan, np.nan, 9]})

        val
    0   1.0
    1   NaN
    2   4.0
    3   5.0
    4   NaN
    5   10.0
    6   1.0
    7   2.0
    8   5.0
    9   NaN
    10  NaN
    11  9.0

My desired output is:

        val
    0   1.0
    1   2.5
    2   4.0
    3   5.0
    4   7.5
    5   10.0
    6   1.0
    7   2.0
    8   5.0
    9   7.0 <<< deadend
    10  7.0 <<< deadend
    11  9.0

I've looked into other solutions such as "Fill cell containing NaN with average of value before and after", but that approach does not work when there are two or more consecutive NaNs.

Any help is greatly appreciated!

 


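Use ffill + bfill and divide by 2. ffill carries the last non-NaN value forward and bfill carries the next non-NaN value backward, so their mean is the average of the two neighbours for every NaN in a run. Non-NaN cells are unchanged, because (x + x) / 2 = x: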

    df = (df.ffill() + df.bfill()) / 2
    print(df)

         val
    0    1.0
    1    2.5
    2    4.0
    3    5.0
    4    7.5
    5   10.0
    6    1.0
    7    2.0
    8    5.0
    9    7.0
    10   7.0
    11   9.0
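To see why this handles consecutive NaNs, it can help to lay the intermediate results side by side. The sketch below is only illustrative: the column names `prev`, `next`, and `mean` are made up for this example and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'val': [1, np.nan, 4, 5, np.nan, 10, 1, 2, 5, np.nan, np.nan, 9]})

# Illustrative column names (assumptions, not from the original answer).
steps = pd.DataFrame({
    'val': df['val'],
    'prev': df['val'].ffill(),  # last non-NaN value at or before each row
    'next': df['val'].bfill(),  # first non-NaN value at or after each row
})

# Both rows of the NaN run at index 9-10 see prev=5.0 and next=9.0,
# so both are filled with the same mean.
steps['mean'] = (steps['prev'] + steps['next']) / 2
print(steps)
```

Rows 9 and 10 share the same `prev` and `next`, which is why the whole run receives a single value rather than a per-row interpolation.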

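EDIT: If the first or last element is NaN, it stays NaN after averaging, because one of the two fills has nothing to propagate there. In that case, follow up with bfill and ffill (Dark's suggestion):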

    df = pd.DataFrame({'val': [np.nan, 1, np.nan, 4, 5, np.nan,
                               10, 1, 2, 5, np.nan, np.nan, 9, np.nan]})
    df = (df.ffill() + df.bfill()) / 2
    df = df.bfill().ffill()
    print(df)

         val
    0    1.0
    1    1.0
    2    2.5
    3    4.0
    4    5.0
    5    7.5
    6   10.0
    7    1.0
    8    2.0
    9    5.0
    10   7.0
    11   7.0
    12   9.0
    13   9.0
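If this is needed in more than one place, the two steps could be wrapped in a small helper. This is a minimal sketch: the function name `fill_nan_with_neighbor_mean` is hypothetical, and it assumes a numeric Series or DataFrame that should be filled along the row axis.

```python
import numpy as np
import pandas as pd

def fill_nan_with_neighbor_mean(obj):
    """Hypothetical helper: fill NaNs with the mean of the nearest
    non-NaN values before and after; edge NaNs take the nearest value."""
    # Interior NaNs: average of the forward-filled and backward-filled values.
    averaged = (obj.ffill() + obj.bfill()) / 2
    # Leading NaNs are filled from the next value, trailing NaNs from the previous one.
    return averaged.bfill().ffill()

s = pd.Series([np.nan, 1, np.nan, 4, np.nan])
filled = fill_nan_with_neighbor_mean(s)
print(filled.tolist())

# A column with no non-NaN values at all has nothing to fill from and stays NaN.
all_nan = fill_nan_with_neighbor_mean(pd.Series([np.nan, np.nan]))
print(all_nan.isna().all())
```

Because `ffill` and `bfill` work column by column on a DataFrame, the same helper should apply to a frame with several numeric columns without changes.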
