I have a table that looks like this:

    A  B
    1  cat
    1  cat
    1  dog
    2  illama
    2  alpaca
    3  donkey

I want to reduce it to this:

    A  B
    1  cat
    3  donkey

1 appears three times in A, and the value cat occurs most often for it, so cat is kept.
There is no majority for 2, so it is considered ambiguous and removed completely.
3 remains as it has no duplicate.
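
To make the example reproducible, here is a minimal sketch that builds the table above as a DataFrame and counts how often each value of B occurs within each group of A. The name df is the one the code in this thread assumes; counts is only an illustrative name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'A': [1, 1, 1, 2, 2, 3],
    'B': ['cat', 'cat', 'dog', 'illama', 'alpaca', 'donkey'],
})

# Occurrences of each B value per group of A.
# Group 1 has a clear winner (cat), group 2 is tied, group 3 has a single value.
counts = df.groupby('A').B.value_counts()
print(counts)
```
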
This is a two-step solution using groupby and pd.Series.mode:

    # find the mode for each group
    i = df.groupby('A').B.apply(pd.Series.mode).reset_index(level=1, drop=True)
    # filter out groups which have more than one mode (ambiguous groups)
    j = i[i.groupby(level=0).transform('count') == 1].reset_index()

    print(j)
       A       B
    0  1     cat
    1  3  donkey

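
For anyone who wants to run the two steps end to end, here is a self-contained sketch on the sample data from the question. The names modes, n_modes and result are illustrative and stand in for i and j above. Printing modes shows why the second step is needed: a tied group contributes one row per mode.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 2, 2, 3],
    'B': ['cat', 'cat', 'dog', 'illama', 'alpaca', 'donkey'],
})

# Step 1: one row per (group, mode). Group 2 is tied, so it yields two rows.
modes = df.groupby('A').B.apply(pd.Series.mode).reset_index(level=1, drop=True)
print(modes)

# Step 2: count the modes per group and keep only groups with exactly one.
n_modes = modes.groupby(level=0).transform('count')
result = modes[n_modes == 1].reset_index()
print(result)
```
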
Alternatively, define a custom function that computes the mode and call it with
apply. The filtration logic is subsumed into the function.

    def foo(x):
        m = pd.Series.mode(x)
        # return the mode only when it is unique; ambiguous groups return None
        if len(m) == 1:
            return m

    df.groupby('A').B.apply(foo).reset_index(level=1, drop=True).reset_index()

       A       B
    0  1     cat
    1  3  donkey