Number of unique pairs within one column – pandas


I am having a little problem with producing statistics for my dataframe in pandas. My dataframe looks like this (I omit the index):

id    type
1      A
2      B
3      A
1      B
3      B
2      C
4      B
4      C

Importantly, each id has exactly two type values assigned, as the example above shows. I want to count how often each type combination occurs, that is, the number of unique ids that have a given pair of types. The result should be a dataframe like this:

type    count
A, B      2
A, C      0
B, C      2

I tried using groupby in many ways, without success. I can produce this kind of count with a for-loop and a number of lines of code, but I believe there has to be a more elegant, Pythonic solution to this problem.

Thanks in advance for any hints.


Using GroupBy + apply with value_counts:

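from itertools import combinations
import pandas as pd

def combs(types):
    # All sorted 2-element combinations of the types in one id group
    return pd.Series(list(combinations(sorted(types), 2)))

# Build the pairs per id, then count how often each pair occurs
res = df.groupby('id')['type'].apply(combs).value_counts()

print(res)

(A, B)    2
(B, C)    2
Name: type, dtype: int64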
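The output above only lists pairs that actually occur, so (A, C) with a count of 0 is missing from it. The following is a minimal sketch of one way to include the zero rows and get the "A, B"-style labels from the question. It is not part of the original answer; it assumes each id has exactly two types (as stated in the question), and the names `pairs`, `counts`, `all_pairs`, `result` and `out` are illustrative only.

```python
from itertools import combinations

import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "id":   [1, 2, 3, 1, 3, 2, 4, 4],
    "type": ["A", "B", "A", "B", "B", "C", "B", "C"],
})

# Sort by type first so each id's label is in a fixed order,
# then join the two types of every id into one string such as "A, B".
pairs = df.sort_values("type").groupby("id")["type"].agg(", ".join)

# Number of ids per pair label (only pairs that occur at least once).
counts = pairs.value_counts()

# Every possible pair of the observed types, using the same label format.
all_pairs = [", ".join(p) for p in combinations(sorted(df["type"].unique()), 2)]

# Reindex against all possible pairs; pairs that never occur get 0.
result = counts.reindex(all_pairs, fill_value=0)

# Turn the Series into a two-column dataframe: type, count.
out = result.rename_axis("type").reset_index(name="count")
print(out)
```

If an id could have more than two types, the string join would produce labels such as "A, B, C" instead of pairs; in that case the `combinations`-based `combs` function from the answer is the safer building block.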

