Get the index of the first previous smaller value

I have a dataframe that looks like this:

    index value
    0     1
    1     1
    2     2
    3     3
    4     2
    5     1
    6     1

What I want is, for each value, to return the index of the previous smaller value and, in addition, the index of the previous "1" value. If the value is 1 I don't need them (both values can be -1 or something).

So what I'm after is:

    index value  previous_smaller_index  previous_1_index
    0     1      -1                      -1
    1     1      -1                      -1
    2     2       1                       1
    3     3       2                       1
    4     2       1                       1
    5     1      -1                      -1
    6     1      -1                      -1

I tried using rolling, cumulative functions etc. but I couldn't figure it out. Any help would be appreciated!

Edit: SpghttCd already provided a nice solution for the "previous 1" problem. I'm looking for a nice pandas one-liner for the "previous smaller" problem. (Of course, nicer and more efficient solutions are welcome for both problems.)

 


  • "previous_smaller_index" can be found using vectorised numpy broadcasted comparison with argmax.

  • "previous_1_index" can be solved using groupby and idxmax on a cumsummed mask.

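    import numpy as np

    m = df.value.eq(1)
    # tril keeps earlier positions only; reversing the columns makes argmax pick the nearest one
    u = np.tril(df.value.values < df.value.values[:, None])[:, ::-1].argmax(1)
    v = m.cumsum()

    df['previous_smaller_index'] = np.where(m, -1, len(df) - u - 1)
    df['previous_1_index'] = v.groupby(v).transform('idxmax').mask(m, -1)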

    df

       index  value  previous_smaller_index  previous_1_index
    0      0      1                      -1                -1
    1      1      1                      -1                -1
    2      2      2                       1                 1
    3      3      3                       2                 1
    4      4      2                       1                 1
    5      5      1                      -1                -1
    6      6      1                      -1                -1
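To see what the broadcasted comparison is doing, it can help to unpack it on a small input. The following is a minimal sketch, not the answer's exact code: the helper name `previous_smaller_index` is hypothetical, and it adds an explicit "no match" check so that rows without any earlier smaller value fall back to -1 regardless of their value.

```python
import numpy as np
import pandas as pd


def previous_smaller_index(values):
    """For each position i, return the largest j < i with values[j] < values[i], or -1."""
    v = np.asarray(values)
    n = len(v)
    # smaller[i, j] is True when v[j] < v[i]; tril(k=-1) keeps only columns j < i.
    smaller = np.tril(v < v[:, None], k=-1)
    # argmax returns the FIRST True, so reverse the columns to find the
    # nearest earlier match, then map the reversed position back to j.
    nearest = n - 1 - smaller[:, ::-1].argmax(axis=1)
    # Rows with no True at all have no previous smaller value.
    return np.where(smaller.any(axis=1), nearest, -1)


df = pd.DataFrame({"value": [1, 1, 2, 3, 2, 1, 1]})
df["previous_smaller_index"] = previous_smaller_index(df["value"])
print(df)

# A non-symmetric input, to check that the search really looks backwards.
print(previous_smaller_index([3, 1, 2, 5, 4]))
```

Note that the intermediate mask has one row and one column per element, so memory use grows quadratically with the length of the column.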

If you want these as one liners, you can scrunch a few lines together into one:

    m = df.value.eq(1)
    df['previous_smaller_index'] = np.where(
        m, -1, len(df) - np.tril(df.value.values < df.value.values[:, None])[:, ::-1].argmax(1) - 1
    )

    # Optimizing @SpghttCd's `previous_1_index` calculation a bit
    df['previous_1_index'] = np.where(
        m, -1, df.index.where(m).to_series(index=df.index).ffill().fillna(-1).astype(int)
    )

    df

        index  value  previous_1_index  previous_smaller_index
    0      0      1                -1                      -1
    1      1      1                -1                      -1
    2      2      2                 1                       1
    3      3      3                 1                       2
    4      4      2                 1                       1
    5      5      1                -1                      -1
    6      6      1                -1                      -1
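The forward-fill idea behind `previous_1_index` is easier to follow when split into steps. This is a sketch under the assumption that the frame has a default integer index; the intermediate names `is_one`, `labels` and `last_one` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 1, 2, 3, 2, 1, 1]})

is_one = df["value"].eq(1)

# Keep the index label only on rows holding a 1; all other rows become NaN.
labels = df.index.to_series().where(is_one)

# Carry the last seen label forward; rows before the first 1 (if any) get -1.
last_one = labels.ffill().fillna(-1).astype(int)

# Rows that are themselves 1 are reported as -1, as the question asks.
df["previous_1_index"] = last_one.mask(is_one, -1)
print(df)
```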

Overall Performance

Setup and performance benchmarking was done using perfplot. The code can be found at this gist.

[Figure: overall performance benchmark plot]

Timings are relative (the y-scale is logarithmic).
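Because the broadcasted comparison builds a boolean matrix with one row and one column per element, it may become impractical for long frames. A common alternative for "nearest previous smaller" problems is a single pass with a stack of candidate indices. The sketch below is not one of the benchmarked solutions; the function name `previous_smaller_index_stack` is hypothetical, and it expects a plain list (for example `df["value"].tolist()`).

```python
def previous_smaller_index_stack(values):
    """For each i, index of the nearest earlier element strictly smaller than values[i], else -1."""
    result = []
    stack = []  # candidate indices; their values increase strictly from bottom to top
    for i, v in enumerate(values):
        # Drop candidates that are not smaller than the current value:
        # the current index is both nearer and no larger, so it replaces them.
        while stack and values[stack[-1]] >= v:
            stack.pop()
        result.append(stack[-1] if stack else -1)
        stack.append(i)
    return result


print(previous_smaller_index_stack([1, 1, 2, 3, 2, 1, 1]))
```

Each index is pushed and popped at most once, so the loop runs in linear time, at the cost of leaving vectorised NumPy.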


previous_1_index Performance

Gist with relevant code.

[Figure: previous_1_index performance benchmark plot]
