Find most efficient groups of pairs

  • A+
Category:Languages

Problem

I have a group of people, and I want each person to have a 1:1 meeting with every other person in the group. A given person can only meet with one other person at a time, so I want to do the following:

  1. Find every possible pairing combination
  2. Group pairs together into "rounds" of meetings, where each person can only be in a round once, and where a round should contain as many pairs as possible to satisfy all possible pairing combinations in the smallest number of rounds.

To demonstrate the problem in terms of desired input/output, let's say I have the following list:

>>> people = ['Dave', 'Mary', 'Susan', 'John'] 

I want to produce the following output:

>>> for round in make_rounds(people): >>>     print(round) [('Dave', 'Mary'), ('Susan', 'John')] [('Dave', 'Susan'), ('Mary', 'John')] [('Dave', 'John'), ('Mary', 'Susan')] 

If I had an odd number of people, then I would expect this result:

>>> people = ['Dave', 'Mary', 'Susan'] >>> for round in make_rounds(people): >>>     print(round) [('Dave', 'Mary')] [('Dave', 'Susan')] [('Mary', 'Susan')] 

The key to this problem is that I need my solution to be performant (within reason). I've written code that works, but as the size of people grows it becomes exponentially slow. I don't know enough about writing performant algorithms to know whether my code is inefficient, or whether I'm simply bound by the parameters of the problem

What I've tried

Step 1 is easy: I can get all possible pairings using itertools.combinations:

>>> from itertools import combinations >>> people_pairs = set(combinations(people, 2)) >>> print(people_pairs) {('Dave', 'Mary'), ('Dave', 'Susan'), ('Dave', 'John'), ('Mary', 'Susan'), ('Mary', 'John'), ('Susan', 'John')} 

To work out the rounds themselves, I'm building a round like so:

  1. Create an empty round list
  2. Iterate over a copy of the people_pairs set calculated using the combinations method above
  3. For each person in the pair, check if there are any existing pairings inside the current round that already contain that individual
  4. If there's already a pair that contains one of the individuals, skip that pairing for this round. If not, add the pair to the round, and remove the pair from the people_pairs list.
  5. Once all the people pairs have been iterated over, append the round to a master rounds list
  6. Start again, since people_pairs now contains only the pairs that didn't make it into the first round

Eventually this produces the desired result, and whittles down my people pairs until there are none left and all the rounds are calculated. I can already see that this is requiring a ridiculous number of iterations, but I don't know a better way of doing this.

Here's my code:

from itertools import combinations  # test if person already exists in any pairing inside a round of pairs def person_in_round(person, round):     is_in_round = any(person in pair for pair in round)     return is_in_round  def make_rounds(people):     people_pairs = set(combinations(people, 2))     # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty     while people_pairs:         round = []         # make a copy of the current state of people_pairs to iterate over safely         for pair in set(people_pairs):             if not person_in_round(pair[0], round) and not person_in_round(pair[1], round):                 round.append(pair)                 people_pairs.remove(pair)         yield round 

Plotting out the performance of this method for list sizes of 100-300 using https://mycurvefit.com shows that calculating rounds for a list of 1000 people would probably take around 100 minutes. Is there a more efficient way of doing this?

Note: I'm not actually trying to organise a meeting of 1000 people :) this is just a simple example that represents the matching / combinatorics problem I'm trying to solve.

 


When you need fast lookups hashes/dicts are the way to go. Keep track of whose been in each round in a dict rather than a list and it will be much faster.

Since you are getting in algorithms, studying big O notation will help you out and knowing what data structures are good at what kind of operations is also key. See this guide: https://wiki.python.org/moin/TimeComplexity for time complexity of Python built-ins. You'll see that checking for an item in a list is O(n) meaning that it scales linearly with the size of your inputs. So since it is in a loop you end up with O(n^2) or worse. For dicts, lookups are generally O(1) meaning that it doesn't matter the size of your input.

Also, don't override built-ins. I'd changed round to round_

from itertools import combinations  # test if person already exists in any pairing inside a round of pairs def person_in_round(person, people_dict):     return people_dict.get(person, False)  def make_rounds(people):     people_pairs = set(combinations(people, 2))     people_in_round = {}     # we will remove pairings from people_pairs whilst we build rounds, so loop as long as people_pairs is not empty     while people_pairs:         round_ = []         people_dict = {}         # make a copy of the current state of people_pairs to iterate over safely         for pair in set(people_pairs):             if not person_in_round(pair[0], people_dict) and not person_in_round(pair[1], people_dict):                 round_.append(pair)                 people_dict[pair[0]] = True                 people_dict[pair[1]] = True                   people_pairs.remove(pair)         yield round_ 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: