Pandas: replace numpy.nan cell with maximum of non-nan adjacent cells


test case:

import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))

I want to replace each NaN entry A[i, j] with the maximum of its non-NaN adjacent entries, where A[i + 1, j], A[i - 1, j], A[i, j + 1], A[i, j - 1] are the entries adjacent to A[i, j].

In so many words, this:

     A    B    C  D
0  NaN  2.0  NaN  0
1  3.0  4.0  NaN  1
2  NaN  NaN  NaN  5
3  NaN  3.0  NaN  4

should become this:

     A    B    C    D
0  3.0  2.0  2.0  0.0
1  3.0  4.0  4.0  1.0
2  3.0  4.0  5.0  5.0
3  3.0  3.0  4.0  4.0

 


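You can apply rolling with a centered window of 3 along each axis (down the rows, then across the columns) and take the element-wise maximum of the two results. Because a NaN cell contributes nothing to the maximum of its own window, that value is the maximum of its non-NaN neighbours, and you can use it to fill the missing values of the original.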

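# Max over each cell and its vertical neighbours (window of 3 down the rows)
df1 = df.rolling(3, center=True, min_periods=1).max().fillna(-np.inf)
# Same for the horizontal neighbours, by transposing before and after
df2 = df.T.rolling(3, center=True, min_periods=1).max().T.fillna(-np.inf)
# Element-wise maximum of the two results
fill = df1.where(df1 > df2).fillna(df2)
# Only the NaN cells of df are replaced
df.fillna(fill)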

Output

     A    B    C  D
0  3.0  2.0  2.0  0
1  3.0  4.0  4.0  1
2  3.0  4.0  5.0  5
3  3.0  3.0  4.0  4
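
As a cross-check, here is a minimal sketch that works directly from the adjacency definition in the question rather than from rolling. It is a different technique (NaN-padding the array with NumPy and comparing the four shifted views), and the helper name fill_nan_with_neighbor_max is made up for illustration. It assumes every column is numeric, and it returns all columns as floats, so D comes back as 0.0, 1.0, ... as in the question's expected table. A NaN cell with no non-NaN neighbour is left as NaN here, whereas the rolling version above would fill it with -inf.

```python
import numpy as np
import pandas as pd


def fill_nan_with_neighbor_max(df):
    """Replace each NaN with the max of its non-NaN up/down/left/right neighbours.

    Hypothetical helper for illustration; assumes all columns are numeric.
    """
    values = df.to_numpy(dtype=float)
    # Pad with NaN so edge cells see "missing" neighbours outside the frame.
    padded = np.pad(values, 1, constant_values=np.nan)
    neighbours = np.stack([
        padded[:-2, 1:-1],  # cell above
        padded[2:, 1:-1],   # cell below
        padded[1:-1, :-2],  # cell to the left
        padded[1:-1, 2:],   # cell to the right
    ])
    # Treat missing neighbours as -inf so they never win the maximum.
    best = np.where(np.isnan(neighbours), -np.inf, neighbours).max(axis=0)
    # No usable neighbour at all: keep the cell as NaN.
    best[best == -np.inf] = np.nan
    # Only cells that were NaN in the input take the neighbour maximum.
    filled = np.where(np.isnan(values), best, values)
    return pd.DataFrame(filled, index=df.index, columns=df.columns)


df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))
result = fill_nan_with_neighbor_max(df)
print(result)
```

On the test case this should print the same values as the expected table in the question.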
