Replace 1s in one hot columns with values from another column


I have a data frame that looks like this:

    import pandas as pd

    df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
    df

       value  item1  item2  item3
    0      4      0      1      0
    1      5      1      0      0
    2      3      0      0      1

Basically what I want to do is replace the value of the one hot encoded elements with the value from the "value" column and then delete the "value" column. The resulting data frame should be like this:

    df_out = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})

       item1  item2  item3
    0      0      4      0
    1      5      0      0
    2      0      0      3

 


Why not just multiply?

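    # axis=0 matches the popped Series to the row index,
    # so each row is scaled by its own "value"
    df.mul(df.pop('value'), axis=0)

       item1  item2  item3
    0      0      4      0
    1      5      0      0
    2      0      0      3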

DataFrame.pop has the nice effect of in-place removing and returning a column, so you can do this in a single step.
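To make the alignment explicit, here is a minimal sketch using the example frame from the question. It compares `DataFrame.mul` with `axis=0`, which matches a Series against the row index, with multiplying by a bare 1-D NumPy array, which pandas matches against the columns instead. The names `expected`, `by_row` and `by_col` are only for illustration.

```python
import pandas as pd

# Example frame from the question (item columns hold only 0/1).
df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
expected = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})

value = df.pop("value")  # df now holds only the item columns

# Series with axis=0: row i is multiplied by value[i].
by_row = df.mul(value, axis=0)

# Bare 1-D array: element j is applied to column j instead.
by_col = df * value.to_numpy()

print(by_row.equals(expected))
print(by_col.equals(expected))
```

On a square frame like this one (3 rows, 3 item columns) the column-wise product raises no error, so the mismatch is easy to miss.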


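If the item columns contain nonzero values other than 1, convert them to booleans first so that every nonzero entry counts as 1:

    # pop first: astype(bool) returns a copy, so "value" must already be gone
    value = df.pop('value')
    df.astype(bool).mul(value, axis=0)

       item1  item2  item3
    0      0      4      0
    1      5      0      0
    2      0      0      3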
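As an illustration of the boolean variant, the sketch below uses made-up item columns that hold counts (2 and 7) rather than strict 0/1 flags. That data is an assumption for the example and does not come from the question.

```python
import pandas as pd

# Hypothetical data: item columns hold arbitrary nonzero markers, not just 1.
df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 2, 0], "item2": [7, 0, 0], "item3": [0, 0, 1]})

value = df.pop("value")

# astype(bool) maps every nonzero entry to True, which multiplies as 1.
result = df.astype(bool).mul(value, axis=0)

print(result)
```

Without the `astype(bool)` step, the 2 and the 7 would scale the result to 10 and 28 instead of 5 and 4.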

If your DataFrame has other columns, then do this:

    df

       value  name  item1  item2  item3
    0      4  John      0      1      0
    1      5  Mike      1      0      0
    2      3  Stan      0      0      1

    # cols = df.columns[df.columns.str.startswith('item')]
    cols = df.filter(like='item').columns
    # multiply only the item columns, row by row, and drop "value"
    df[cols] = df[cols].mul(df.pop('value'), axis=0)

    df

       name  item1  item2  item3
    0  John      0      4      0
    1  Mike      5      0      0
    2  Stan      0      0      3
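If you would rather not modify the original frame, the same steps can be wrapped in a small helper that works on a copy. This is only a sketch: `scale_one_hot` and its parameters `value_col` and `prefix` are hypothetical names, not part of pandas.

```python
import pandas as pd


def scale_one_hot(df: pd.DataFrame, value_col: str = "value", prefix: str = "item") -> pd.DataFrame:
    """Return a copy of df with the one-hot columns scaled by value_col.

    Hypothetical helper: value_col is dropped from the result, and columns
    whose names do not start with prefix are left untouched.
    """
    out = df.copy()
    value = out.pop(value_col)  # removes the column from the copy only
    cols = out.columns[out.columns.str.startswith(prefix)]
    out[cols] = out[cols].mul(value, axis=0)  # row i scaled by value[i]
    return out


df = pd.DataFrame({
    "value": [4, 5, 3],
    "name": ["John", "Mike", "Stan"],
    "item1": [0, 1, 0],
    "item2": [1, 0, 0],
    "item3": [0, 0, 1],
})

result = scale_one_hot(df)
print(result)
```

Because the helper copies first, `df` still contains the `value` column afterwards.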
