I have a dataframe with 3 columns in Python:
    Name1  Name2  Value
    Juan   Ale    1
    Ale    Juan   1
and would like to eliminate the duplicates based on columns Name1 and Name2 combinations.
In my example, both rows contain the same pair of names (just in a different order), so I would like to delete the second row and keep only the first. The end result should be:
    Name1  Name2  Value
    Juan   Ale    1
Any idea will be really appreciated!
You can convert each row's pair of names to a frozenset and use duplicated:
    res = df[~df[['Name1', 'Name2']].apply(frozenset, axis=1).duplicated()]
    print(res)

      Name1 Name2  Value
    0  Juan   Ale      1
frozenset is necessary instead of set, since duplicated uses hashing to check for duplicates and a plain set is mutable and therefore not hashable.
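For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and splits the one-liner into two steps. The names pair_key, same_pair and set_is_hashable are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name1": ["Juan", "Ale"],
    "Name2": ["Ale", "Juan"],
    "Value": [1, 1],
})

# Build an order-insensitive, hashable key for each row
pair_key = df[["Name1", "Name2"]].apply(frozenset, axis=1)

# duplicated() flags every repeat of a key after its first occurrence,
# so negating the mask keeps only the first row for each pair
res = df[~pair_key.duplicated()]
print(res)

# frozenset ignores element order and can be hashed; a plain set cannot
same_pair = frozenset(["Juan", "Ale"]) == frozenset(["Ale", "Juan"])
try:
    hash({"Juan", "Ale"})
    set_is_hashable = True
except TypeError:
    set_is_hashable = False
```

One thing to keep in mind: a row where Name1 and Name2 hold the same value collapses to a one-element frozenset, which still works as a key but no longer records that the name appeared twice.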