Why is this implementation of Fibonacci extremely fast?


This implementation of Fibonacci is easy to understand but very slow:

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

The following implementation of Fibonacci is hard to understand but super fast. It calculates the 100,000th Fibonacci number instantly on my laptop.

fib = fastFib 1 1

fastFib _ _ 0 = 0
fastFib _ _ 1 = 1
fastFib _ _ 2 = 1
fastFib a b 3 = a + b
fastFib a b c = fastFib (a + b) a (c - 1)

What magic is happening in the latter implementation, and how does it work?

Why is the first implementation slow?

Well, it’s slow because each call to fib may result in up to two further calls to fib (the average is more like 1.6). To compute fib 5 you call fib 4 and fib 3, which in turn call fib 3 and fib 2, and fib 2 and fib 1, so each call to fib (n+1) results in something like twice as much work as a call to fib n.
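One way to see the blow-up concretely: count how many calls the naive definition makes. Here `calls` is a hypothetical helper of my own, following the same recursion shape as the naive fib:

```haskell
-- Number of calls the naive fib makes to evaluate fib n:
-- one for the call itself, plus everything its two recursive calls do.
calls :: Int -> Integer
calls 0 = 1
calls 1 = 1
calls n = 1 + calls (n - 1) + calls (n - 2)
```

The call count itself satisfies a Fibonacci-like recurrence (it works out to 2*fib(n+1) - 1), so the work grows exponentially in n.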

One thing we might observe is that we work out the same thing lots of times, e.g. above we work out fib 3 twice. That could take a long time if you had to work out e.g. fib 100 twice.

How to do fib faster?

I think it’s better to start with this than trying to jump straight into fastFib. If I asked you to compute the tenth Fibonacci number by hand, I expect you wouldn’t compute the third one dozens of times by reapplying the algorithm; you would remember what you had so far. One can do the same in Haskell: just write a program that generates the list of Fibonacci numbers (lazily) and index into it:

mediumFib = (\n -> fibs !! n)
  where fibs = 0 : 1 : [mediumFib (i-1) + mediumFib (i-2) | i <- [2..]]

This is much faster, but it has drawbacks: it uses a lot of memory to store the list of Fibonacci numbers, and finding the nth element of a list is slow because you have to follow a lot of pointers.

To compute a single Fibonacci number from scratch (i.e. not having any computed already) takes quadratic time.

Another way you might compute the tenth Fibonacci number by hand is by writing down the Fibonacci sequence until you get to the tenth element. You then never need to look far in the past or remember all the things you previously computed, you just need to look at the two previous elements. One can imagine an imperative algorithm to do this

fib(n):
  if (n < 2) return n
  preprevious = 0
  previous = 1
  i = 2
  while true:
    current = preprevious + previous
    if (i == n) return current
    preprevious, previous = previous, current
    i = i + 1

This is just stepping through the recurrence relation:

f_n = f_(n-2) + f_(n-1)

Indeed we can write it in Haskell too:

fastFib n | n < 2     = n
          | otherwise = go 0 1 2
  where
    go pp p i | i == n    = pp + p
              | otherwise = go p (pp + p) (i + 1)

This is pretty fast now and we can transform this into the function you have too. Here are the steps:

1. Swap the argument order of pp (preprevious) and p (previous).
2. Instead of counting i up, start at n and count down.
3. Extract go into a top level function because it no longer depends on n.
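Carrying out those three steps on go lands exactly on the code from the question (type signatures added here for clarity):

```haskell
-- After the transformation: the accumulators are swapped, the counter runs
-- down from n to a base case instead of up to n, and the helper is a
-- top-level function because it no longer mentions n.  The seeds 1 1
-- correspond to fib 2 and fib 1, which is why the recursion bottoms out at 3.
fastFib :: Integer -> Integer -> Integer -> Integer
fastFib _ _ 0 = 0
fastFib _ _ 1 = 1
fastFib _ _ 2 = 1
fastFib a b 3 = a + b
fastFib a b c = fastFib (a + b) a (c - 1)

fib :: Integer -> Integer
fib = fastFib 1 1
```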

This algorithm only needs to do one sum in each step so it is linear time and that’s pretty fast. To compute fib (n+1) is only a small constant more work than computing fib n. Compare this to above where it was about 1.6 times as much work.

Is there a faster fib?

Sure there is. It turns out there’s a clever way to express the Fibonacci sequence. We consider the transformation a,b -> a+b,a to be a special case of a family of transformations T_pq:

T_pq : a -> bq + aq + ap
       b -> bp + aq

Specifically it is the special case where p = 0 and q = 1. We now can do some algebra to work out if there is a simple way to express applying T_pq twice:

T_pq T_pq :
  a -> (bp + aq)q + (bq + aq + ap)(q + p)
  b -> (bp + aq)p + (bq + aq + ap)q
=
  a -> (b + a)(q^2 + 2pq) + a(q^2 + p^2)
  b -> b(q^2 + p^2) + a(q^2 + 2pq)
= T_(q^2 + p^2),(q^2 + 2pq)
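We can sanity-check that algebra numerically. This is just a sketch; the helper names `t` and `squaringHolds` are mine, not from the original:

```haskell
-- Apply T_pq once to a pair: a -> bq + aq + ap, b -> bp + aq.
t :: Integer -> Integer -> (Integer, Integer) -> (Integer, Integer)
t p q (a, b) = (b*q + a*q + a*p, b*p + a*q)

-- Applying T_pq twice should agree with a single application of
-- T_(p^2 + q^2),(q^2 + 2pq), for any starting pair (a, b).
squaringHolds :: Integer -> Integer -> (Integer, Integer) -> Bool
squaringHolds p q ab = t p q (t p q ab) == t (p*p + q*q) (q*q + 2*p*q) ab
```

With p = 0 and q = 1 the single step `t 0 1` is exactly the Fibonacci step (a, b) -> (a + b, a).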

So now let’s write a simple function to compute T_pq^n (a,b) and fib n

tPow p q a b n | n == 1    = (b*q + a*q + a*p, b*p + a*q)
               | otherwise = let (a', b') = tPow p q a b 1
                             in tPow p q a' b' (n-1)

fib 0 = 0
fib 1 = 1
fib n = fst $ tPow 0 1 1 0 (n-1)

And now we can use our relation to make tPow faster:

tPow p q a b n | n == 1 = (b*q + a*q + a*p, b*p + a*q)
               | odd n  = let (a', b') = tPow p q a b 1
                          in tPow p q a' b' (n-1)
               | even n = tPow (q*q + p*p) (q*q + 2*p*q) a b (n `div` 2)

Why is this faster? Well, it’s faster because computing fib (2*n) is now only a small constant more work than computing fib n, whereas before it was twice as much work, before that four times as much, and before that the square of the amount of work. Indeed the number of steps is something like the number of bits of n in binary plus the number of 1s in the binary representation of n. Computing fib 1024 takes only about 10 steps, whereas the previous algorithm took about 1000. Computing the billionth Fibonacci number takes only about 30 steps, which is a lot less than a billion.
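Assembled into one runnable program (the pieces above, unchanged apart from type signatures):

```haskell
-- Fibonacci in O(log n) steps via repeated squaring of the T_pq transformation.
tPow :: Integer -> Integer -> Integer -> Integer -> Integer -> (Integer, Integer)
tPow p q a b n
  | n == 1    = (b*q + a*q + a*p, b*p + a*q)
  | odd n     = let (a', b') = tPow p q a b 1   -- peel off one application
                in tPow p q a' b' (n - 1)
  | otherwise = tPow (q*q + p*p) (q*q + 2*p*q) a b (n `div` 2)  -- square T_pq

fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fst (tPow 0 1 1 0 (n - 1))
```

Because the even case halves n and the odd case only ever fires once before an even case, the recursion depth is proportional to the length of n in binary.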