No. of distinct subsequences of length 3 in an array of length n

  • A+
Category:Languages

How to calculate the number of distinct sub sequences of length 3 (or in general of length k < n) in an array of length n?

Note: Two sub sequences are considered different if the order of elements in them are different.

Ex: Suppose the array A = [1, 2, 1, 1], then the answer should be 3 because there are only three distinct subsequences of length 3 as shown below:

[1, 1, 1] [1, 2, 1] [2, 1, 1] 

Size of the array n <= 10^5, each element in the array A_i <= n.

My approach:

I figured the brute force approach, i.e., to take tuples of length 3 and insert it into a map. But this is neither space/time efficient.

Edit: This was an interview question and it said that for k = 3 the expected time and space complexity is O(n).

 


As is often the case with interview problems, there's a dynamic programming solution. Let T(m, k) be the number of distinct length-k subsequences of the first m elements. Then assuming one-based indexing on the input A, we have a 2D recurrence

T(m, 0) = 1 T(m, k) = T(m-1, k) + T(m-1, k-1) -           ^^^^^^^^^   ^^^^^^^^^^^      A_m not chosen   A_m chosen              { T(i-1, k-1), if i < m is the maximum index where A_i = A_m             { 0,           if no such index exists 

The subtracted term ensures that we don't count duplicates; see https://stackoverflow.com/a/5152203/2144669 for more explanation.

The running time (with a hash map to maintain the rightmost occurrence so far of each symbol seen) is O(k n), which is O(n) for k = 3.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: