Concatenating a large number of lists


I'm trying to write a function that concatenates multiple lists whenever the same element appears in two or more of them.

Examples:

[[1,2],[3,4,5],[0,4]] would become [[1,2],[0,3,4,5]]

[[1],[1,2],[0,2]] would become [[0,1,2]]

[[1, 2], [2, 3], [3, 4]] would become [[1,2,3,4]]

In other words, lists that share a common element are merged, and the shared element is kept only once. The final lists must contain unique elements.
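As a minimal sketch of that rule for just two lists (the helper name `merge_if_overlapping` is only illustrative and is not part of the function further down):

```python
def merge_if_overlapping(a, b):
    """Return the sorted union of a and b if they share an element, else None."""
    if set(a) & set(b):
        # Building a set drops the duplicated element(s) automatically.
        return sorted(set(a) | set(b))
    return None


print(merge_if_overlapping([3, 4, 5], [0, 4]))  # lists share 4, so they merge
print(merge_if_overlapping([1, 2], [3, 4]))     # nothing in common
```

The full problem is harder than this pairwise case because a merge can create a new overlap with a third list, as in the [[1, 2], [2, 3], [3, 4]] example.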

I wrote the following function. It works, but with a large input (around 100 or 200 sublists) I get the following error: RecursionError: maximum recursion depth exceeded while getting the repr of an object

def concat(L):
    break_cond = False
    print(L)
    for L1 in L:
        for L2 in L:
            if (bool(set(L1) & set(L2)) and L1 != L2):
                break_cond = True
    if (break_cond):
        i, j = 0, 0
        while i < len(L):
            while j < len(L):
                if (bool(set(L[i]) & set(L[j])) and i != j):
                    L[i] = sorted(L[i] + list(set(L[j]) - set(L[i])))
                    L.pop(j)
                j += 1
            i += 1
        return concat(L)

Moreover, I would like to do this using only basic Python, without relying on many libraries. Any ideas? Thanks.

Example of a list that triggers the error:

[[0, 64], [1, 120, 172], [2, 130], [3, 81, 102], [5, 126], [6, 176], [7, 21, 94], [8, 111, 167], [9, 53, 60, 138], [10, 102, 179], [11, 45, 72], [12, 53, 129], [14, 35, 40, 58, 188], [15, 86], [18, 70, 94], [19, 28], [20, 152], [21, 24], [22, 143, 154], [23, 110, 171], [24, 102, 144], [25, 73, 106, 187], [26, 189], [28, 114, 137], [29, 148], [30, 39], [31, 159], [33, 44, 132, 139], [34, 81, 100, 136, 185], [35, 53], [37, 61, 138], [38, 144, 147, 165], [41, 42, 174], [42, 74, 107, 162], [43, 99, 123], [44, 71, 122, 126], [45, 74, 144], [47, 94, 151], [48, 114, 133], [49, 130, 144], [50, 51], [51, 187], [52, 124, 142, 146, 167, 184], [54, 97], [55, 94], [56, 88, 128, 166], [57, 63, 80], [59, 89], [60, 106, 134, 142], [61, 128, 145], [62, 70], [63, 73, 76, 101, 106], [64, 80, 176], [65, 187, 198], [66, 111, 131, 150], [67, 97, 128, 159], [68, 85, 128], [69, 85, 169], [70, 182], [71, 123], [72, 85, 94], [73, 112, 161], [74, 93, 124, 151, 191], [75, 163], [76, 99, 106, 129, 138, 152, 179], [77, 89, 92], [78, 146, 156], [79, 182], [82, 87, 130, 179], [83, 148], [84, 110, 146], [85, 98, 137, 177], [86, 198], [87, 101], [88, 134, 149], [89, 99, 107, 130, 193], [93, 147], [95, 193], [96, 98, 109], [104, 105], [106, 115, 154, 167, 190], [107, 185, 193], [111, 144, 153], [112, 128, 188], [114, 136], [115, 146], [118, 195], [119, 152], [121, 182], [124, 129, 177], [125, 156], [126, 194], [127, 198], [128, 149], [129, 153], [130, 164, 196], [132, 140], [133, 181], [135, 165, 170, 171], [136, 145], [141, 162], [142, 170, 187], [147, 171], [148, 173], [150, 180], [153, 191], [154, 196], [156, 165], [157, 177], [158, 159], [159, 172], [161, 166], [162, 192], [164, 184, 197], [172, 199], [186, 197], [187, 192]] 

 


As mentioned by @ScottBoston, this is a graph problem known as connected components. I suggest you use networkx, as @ScottBoston indicated; in case you cannot, here is a version without networkx:

from itertools import combinations


def bfs(graph, start):
    visited, queue = set(), [start]
    while queue:
        vertex = queue.pop(0)
        if vertex not in visited:
            visited.add(vertex)
            queue.extend(graph[vertex] - visited)
    return visited


def connected_components(G):
    seen = set()
    for v in G:
        if v not in seen:
            c = set(bfs(G, v))
            yield c
            seen.update(c)


def graph(edge_list):
    result = {}
    for source, target in edge_list:
        result.setdefault(source, set()).add(target)
        result.setdefault(target, set()).add(source)
    return result


def concat(l):
    edges = []
    s = list(map(set, l))
    for i, j in combinations(range(len(s)), r=2):
        if s[i].intersection(s[j]):
            edges.append((i, j))
    G = graph(edges)

    result = []
    unassigned = list(range(len(s)))
    for component in connected_components(G):
        union = set().union(*(s[i] for i in component))
        result.append(sorted(union))
        unassigned = [i for i in unassigned if i not in component]

    result.extend(map(sorted, (s[i] for i in unassigned)))

    return result


print(concat([[1, 2], [3, 4, 5], [0, 4]]))
print(concat([[1], [1, 2], [0, 2]]))
print(concat([[1, 2], [2, 3], [3, 4]]))

Output

[[0, 3, 4, 5], [1, 2]]
[[0, 1, 2]]
[[1, 2, 3, 4]]
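If comparing every pair of sublists becomes slow on bigger inputs, the same connected-components idea can also be sketched with a union-find (disjoint-set) structure over the elements themselves. This is a different technique from the BFS version, offered only as a rough alternative. The name `concat_union_find` is made up for illustration, and the sketch assumes the elements are hashable and sortable. It uses loops only, so it should not run into the recursion limit.

```python
def concat_union_find(lists):
    """Merge lists that share at least one element (illustrative helper)."""
    parent = {}  # maps each element to its parent in the disjoint-set forest

    def find(x):
        # Iterative find: walk up to the root, then compress the path.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for lst in lists:
        for item in lst:
            parent.setdefault(item, item)
        # Union every element of the sublist with the sublist's first element.
        for item in lst[1:]:
            root_a, root_b = find(lst[0]), find(item)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the elements by the root of their set.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(group) for group in groups.values())


print(concat_union_find([[1, 2], [3, 4, 5], [0, 4]]))
print(concat_union_find([[1], [1, 2], [0, 2]]))
print(concat_union_find([[1, 2], [2, 3], [3, 4]]))
```

On these three inputs it gives the same results as the output shown above. Note that the groups are sorted in the result, and that an empty sublist contributes nothing to it.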
