I have a dataframe that looks like this:
    index  value
    0      1
    1      1
    2      2
    3      3
    4      2
    5      1
    6      1
What I want is, for each row, the index of the previous smaller value and, in addition, the index of the previous "1" value. If the value is 1 I don't need them (both values can be -1 or something).
So what I'm after is:
    index  value  previous_smaller_index  previous_1_index
    0      1      -1                      -1
    1      1      -1                      -1
    2      2       1                       1
    3      3       2                       1
    4      2       1                       1
    5      1      -1                      -1
    6      1      -1                      -1
I tried using rolling, cumulative functions etc. but I couldn't figure it out. Any help would be appreciated!
Edit: SpghttCd already provided a nice solution for the "previous 1" problem. I'm looking for a nice pandas one-liner for the "previous smaller" problem (even though, of course, nicer and more efficient solutions are welcome for both problems).

"previous_smaller_index" can be found using a vectorised numpy broadcasted comparison with argmax.
"previous_1_index" can be solved using groupby and idxmax on a cumsummed mask.
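    import numpy as np

    m = df.value.eq(1)
    # Mark, for each row, the earlier rows (lower triangle) holding a smaller value;
    # reversing the columns makes argmax pick the closest such row.
    u = np.tril(df.value.values < df.value.values[:, None])[:, ::-1].argmax(1)
    v = m.cumsum()

    df['previous_smaller_index'] = np.where(m, -1, len(df) - u - 1)
    df['previous_1_index'] = v.groupby(v).transform('idxmax').mask(m, -1)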
    df

       index  value  previous_smaller_index  previous_1_index
    0      0      1                      -1                -1
    1      1      1                      -1                -1
    2      2      2                       1                 1
    3      3      3                       2                 1
    4      4      2                       1                 1
    5      5      1                      -1                -1
    6      6      1                      -1                -1
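To make the broadcasting step easier to follow, here is a minimal sketch that builds the comparison matrix in stages. The sample array `vals` and all intermediate names are made up for illustration, and the sketch uses `any` rather than the `value == 1` mask to decide which rows get -1, so it is a variation on the idea rather than the exact code above.

```python
import numpy as np

# Hypothetical sample values, deliberately non-symmetric
vals = np.array([1, 3, 2, 4, 1, 2])
n = len(vals)

# smaller[i, j] is True when vals[j] < vals[i]
smaller = vals < vals[:, None]

# Keep only earlier positions (j < i) by zeroing the diagonal and above
earlier_smaller = np.tril(smaller, k=-1)

# argmax returns the first True in each row, so reverse the columns
# to find the last (closest) earlier position instead
u = earlier_smaller[:, ::-1].argmax(axis=1)

# Map the reversed column back to an index; rows without any True get -1
prev_smaller = np.where(earlier_smaller.any(axis=1), n - 1 - u, -1)
print(prev_smaller.tolist())  # [-1, 0, 0, 2, -1, 4]
```

Note that the intermediate matrix has n × n entries, so memory use grows quadratically with the number of rows.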
If you want these as one liners, you can scrunch a few lines together into one:
    m = df.value.eq(1)
    # tril + column reversal so that argmax finds the closest earlier smaller value
    df['previous_smaller_index'] = np.where(
        m, -1, len(df) - np.tril(df.value.values < df.value.values[:, None])[:, ::-1].argmax(1) - 1
    )

    # Optimizing @SpghttCd's `previous_1_index` calculation a bit
    df['previous_1_index'] = (np.where(
        m, -1, df.index.where(m).to_series(index=df.index).ffill(downcast='infer'))
    )

    df

       index  value  previous_1_index  previous_smaller_index
    0      0      1                -1                      -1
    1      1      1                -1                      -1
    2      2      2                 1                       1
    3      3      3                 1                       2
    4      4      2                 1                       1
    5      5      1                -1                      -1
    6      6      1                -1                      -1
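If the quadratic comparison matrix becomes a concern for long frames, a different technique, a monotonic stack in plain Python, may be worth trying. This is only a sketch, not part of the approaches above: the function name `previous_smaller_index` is my own, and it returns -1 whenever no earlier smaller value exists (which covers the rows where the value is 1, assuming 1 is the minimum).

```python
def previous_smaller_index(values):
    """For each position, return the index of the nearest earlier
    strictly smaller value, or -1 if there is none."""
    result = []
    stack = []  # indices whose values are strictly increasing
    for i, v in enumerate(values):
        # Discard candidates that are not strictly smaller than v
        while stack and values[stack[-1]] >= v:
            stack.pop()
        result.append(stack[-1] if stack else -1)
        stack.append(i)
    return result


print(previous_smaller_index([1, 1, 2, 3, 2, 1, 1]))
# [-1, -1, 1, 2, 1, -1, -1]
```

It could then be assigned with something like `df['previous_smaller_index'] = previous_smaller_index(df['value'].tolist())`. Each index is pushed and popped at most once, so the loop runs in linear time, though I haven't benchmarked it against the numpy version.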
Overall Performance
Setup and performance benchmarking was done using perfplot. The code can be found at this gist.
Timings are relative (the yscale is logarithmic).
previous_1_index Performance