- A+

How to calculate the number of distinct sub sequences of length `3`

(or in general of length `k < n`

) in an array of length `n`

?

**Note:** Two sub sequences are considered different if the order of elements in them are different.

Ex: Suppose the array `A = [1, 2, 1, 1]`

, then the answer should be `3`

because there are only three distinct subsequences of length `3`

as shown below:

`[1, 1, 1] [1, 2, 1] [2, 1, 1] `

Size of the array `n <= 10^5`

, each element in the array `A_i <= n`

.

My approach:

I figured the brute force approach, i.e., to take tuples of length `3`

and insert it into a map. But this is neither space/time efficient.

**Edit**: This was an interview question and it said that **for k = 3** the expected time and space complexity is `O(n)`

.

As is often the case with interview problems, there's a dynamic programming solution. Let `T(m, k)`

be the number of distinct length-`k`

subsequences of the first `m`

elements. Then assuming one-based indexing on the input `A`

, we have a 2D recurrence

`T(m, 0) = 1 T(m, k) = T(m-1, k) + T(m-1, k-1) - ^^^^^^^^^ ^^^^^^^^^^^ A_m not chosen A_m chosen { T(i-1, k-1), if i < m is the maximum index where A_i = A_m { 0, if no such index exists `

The subtracted term ensures that we don't count duplicates; see https://stackoverflow.com/a/5152203/2144669 for more explanation.

The running time (with a hash map to maintain the rightmost occurrence so far of each symbol seen) is `O(k n)`

, which is `O(n)`

for `k = 3`

.