# Conditionally offseting values by group with Pandas

• A+
Category：Languages

I am looking for a more efficient and maintainable way to offset values conditionally by group. Easiest to show an example.

Value is always non-negative for `Offset == False` and always negative for `Offset == True`. What I'm looking to do is "collapse" positive Values (flooring at 0) against negative ones by Label.

Note `Label` + `Offset` combined are always unique. Since `Offset` is Boolean, you can only have a maximum of 2 rows per Label.

Example 1

``df = pd.DataFrame({'Label': ['L1', 'L2', 'L3', 'L3'],                    'Offset': [False, False, False, True],                    'Value': [100, 100, 50, -100]})  # input #   Label Offset  Value # 0    L1  False    100 # 1    L2  False    100 # 2    L3  False     50 # 3    L3   True   -100 ``

Desired output:

``  Label Offset  Value 0    L1  False    100 1    L2  False    100 2    L3  False      0 3    L3   True    -50 ``

Example 2

``df = pd.DataFrame({'Label': ['L1', 'L2', 'L3', 'L3'],                    'Offset': [False, False, False, True],                    'Value': [100, 100, 100, -50]})  # input #   Label Offset  Value # 0    L1  False    100 # 1    L2  False    100 # 2    L3  False    100 # 3    L3   True    -50 ``

Desired output:

``  Label Offset  Value 0    L1  False    100 1    L2  False    100 2    L3  False     50 3    L3   True      0 ``

Current inefficient solution

My current solution is a manual loop which is slow and difficult to maintain:

``for label in df['Label'].unique():     mask = df['Label'] == label     if len(df.loc[mask]) == 2:         val_false = df.loc[~df['Offset'] & mask, 'Value'].iloc[0]         val_true = df.loc[df['Offset'] & mask, 'Value'].iloc[0]         if val_false > abs(val_true):             df.loc[~df['Offset'] & mask, 'Value'] += val_true             df.loc[df['Offset'] & mask, 'Value'] = 0         else:             df.loc[~df['Offset'] & mask, 'Value'] = 0             df.loc[df['Offset'] & mask, 'Value'] += val_false ``

I'm looking for a vectorised, or at least partially vectorised, solution to improve performance and get rid of this mess.

Maybe:

``label_sums = df.Value.groupby(df.Label).transform(sum) df["new_sum"] = label_sums.where(np.sign(label_sums) == np.sign(df.Value), 0) ``

which gives me

``In [42]: df Out[42]:    Label  Offset  Value  new_sum 0    L1   False    100      100 1    L2   False    100      100 2    L3   False     50        0 3    L3    True   -100      -50 4    L4   False    100      100 5    L5   False    100      100 6    L6   False    100       50 7    L6    True    -50        0 ``