Find the pairs that differ in a list with Python


I have a list of point pairs and I want to find the pairs whose two points differ. I implemented a function, different(), to do this:

import numpy as np

def different(array):
    res = []
    for (x1, y1), (x2, y2) in array:
        if (x1, y1) != (x2, y2):
            res.append([(x1, y1), (x2, y2)])
    return res

a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

out = different(a)  # get [[(1, 2), (3, 4)],
                    #      [(7, 9), (6, 3)]]

Is there a better way to do it? I want to improve my function different(). The list may contain more than 100,000 pairs.

Timing comparison of the solutions

When there are so many different approaches to a problem, time comparisons can really help sort out the better answers.

Setup

We use an array of shape (200000, 2, 2), since the OP, Vincentlai, pointed out that this is in the range of the expected array size.

a = np.array(np.random.randint(10, size=(200000, 2, 2)))
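The timings in this answer come from IPython's %timeit magic. Outside IPython, a rough equivalent can be sketched with the standard-library timeit module. The helper name best_ms and the repeat/number values are assumptions chosen for illustration, and the helper reports the best run rather than the mean ± std. dev. that %timeit prints, so the figures are not directly comparable.

```python
import timeit

import numpy as np

# Same setup as above: 200,000 random pairs of 2-D integer points.
a = np.random.randint(10, size=(200000, 2, 2))


def vectorised(arr):
    """Keep the pairs whose two points are not identical."""
    return arr[~(arr[:, 0] == arr[:, 1]).all(1)]


def best_ms(func, arr, repeat=5, number=5):
    """Hypothetical helper: best time per call in milliseconds."""
    times = timeit.repeat(lambda: func(arr), repeat=repeat, number=number)
    return min(times) / number * 1000


print(f"{best_ms(vectorised, a):.2f} ms per call")
```

Any of the candidate expressions can be wrapped in a small function like vectorised() and passed to the helper in the same way.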

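Using Joe's answer: numpy.logical_and

%timeit b = a[np.logical_and(a[:,0,0] != a[:,1,0],  a[:,0,1] != a[:,1,1])]
>>> 5.12 ms ± 110 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Note that this mask keeps only the pairs in which both coordinates differ, whereas the original different() and Coldspeed's approaches keep a pair as soon as one coordinate differs.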
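Before comparing speed, it may be worth checking what the logical_and mask actually selects. The sketch below uses a small made-up array (not from the original post) in which one pair differs in a single coordinate only, and compares the logical_and mask with a logical_or mask built from the same two comparisons.

```python
import numpy as np

# Hypothetical test data: the second pair differs only in its y coordinate.
a = np.array([[[1, 2], [3, 4]],   # both coordinates differ
              [[1, 2], [1, 5]],   # only y differs
              [[3, 3], [3, 3]]])  # identical points

x_differs = a[:, 0, 0] != a[:, 1, 0]
y_differs = a[:, 0, 1] != a[:, 1, 1]

mask_and = np.logical_and(x_differs, y_differs)  # both coordinates must differ
mask_or = np.logical_or(x_differs, y_differs)    # at least one coordinate differs

print(mask_and.tolist())  # [True, False, False]
print(mask_or.tolist())   # [True, True, False]
```

Only the logical_or mask matches the (x1, y1) != (x2, y2) test in the question, so the logical_and timing measures a slightly different filter.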

Using Coldspeed's first answer: vectorised comparison

%timeit b = a[~(a[:, 0] == a[:, 1]).all(1)]
>>> 13.7 ms ± 559 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
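To see how the vectorised comparison works, here is a minimal sketch that applies it step by step to the small array from the question. The intermediate names same_coords and same_pair are introduced only for illustration.

```python
import numpy as np

# The example array from the question.
a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

# Compare the first point of every pair with the second, coordinate by coordinate.
same_coords = a[:, 0] == a[:, 1]     # boolean array of shape (4, 2)

# A pair is identical only if every coordinate matches.
same_pair = same_coords.all(axis=1)  # boolean array of shape (4,)

# Invert the mask to keep the pairs that differ.
result = a[~same_pair]

print(result.tolist())  # [[[1, 2], [3, 4]], [[7, 9], [6, 3]]]
```

The result is a NumPy array rather than a list of tuples; result.tolist() gives nested lists if a plain Python structure is needed.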

Using Coldspeed's second answer: numpy.logical_or

%timeit b = a[np.logical_or.reduce(a[:, 0] != a[:, 1], axis=1)]
>>> 13.2 ms ± 498 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using U9 Forward's answer: filter

%timeit b = list(filter(lambda x: x[0] != x[1], a.tolist()))
>>> 102 ms ± 4.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using aydow's answer: list comprehension

%timeit b = [[(x1, y1), (x2, y2)] for (x1, y1), (x2, y2) in a if (x1, y1) != (x2, y2)]
>>> 752 ms ± 11.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Conclusions

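Joe's approach with numpy.logical_and is by far the fastest, although its mask only keeps pairs in which both coordinates differ, so its result is not identical to that of the original different(). Predictably, every pure-Python approach falls far short of any of the NumPy ones.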