I have an array of numbers, for example:
A = [1, 5, 2, 4, 3]
and an array that determines a rank, for example:
B = [0, 2, 1]
My goal is to find all the subarrays of A that "obey" the rank B. If a subarray obeys the rank, that means that the i-th smallest element of the subarray must have
B[i] as its (subarray) index. So for a subarray to match, the smallest element within it must be in position 0, the second smallest must be in position 2, and the biggest element must be in position 1.
So for example here, there are two subarrays of A that match the ranking: [1, 5, 2] (because A < A < A) and [2, 4, 3].
So far, I've managed to find a solution that has an
O(mn) (m is len(A) and n is len(B)) time complexity, it iterates over all the subarrays of length 3 and verifies if they are correctly ordered:
A = [1, 5, 2, 4, 3] B = [0, 2, 1] m = len(A) n = len(B) for i in range(m - n + 1): current_subarray = A[i:i + n] # we now do n - 1 comparaisons to check whether the subarray is correctly ordered. for B_index in range(n - 1): if current_subarray[B[B_index]] > current_subarray[B[B_index + 1]]: break else: print("Subarray found:", current_subarray)
Subarray found: [1, 5, 2] Subarray found: [2, 4, 3]
This works, but I was wondering if there was a better optimized algorithm (better than
O(mn)) to fulfill this task.
Instead of iterating over B to compare ranks, you could use
scipy.stats.rankdata to get the ranks directly:
from scipy.stats import rankdata A = [1, 5, 2, 4, 3] B = [0, 2, 1] m = len(A) n = len(B) for i in range(m - n + 1): current_subarray = A[i:i + n] ranked_numbers = (rankdata(current_subarray).astype(int) - 1).tolist() if ranked_numbers == B: print("Subarray found:", current_subarray) # Subarray found: [1, 5, 2] # Subarray found: [2, 4, 3]
rankdata() starts ranks from 1 instead of 0, which is why the above minuses 1 from every element in the numpy array.