上QQ阅读APP看书，第一时间看更新

Caching

Our good friend Mr. Smith has improved his programming skills considerably. Currently, he is learning about recursive functions and he programmed his first recursive piece of code. He wrote a simple seven-liner which calculates the n^th element in the Fibonacci sequence:

function TfrmFibonacci.FibonacciRecursive(element: int64): int64;
begin
  if element < 3 then
    Result := 1
  else
    Result := FibonacciRecursive(element - 1) +
              FibonacciRecursive(element - 2);
end;

I will not argue with him—if you look up the definition of a Fibonacci sequence it really looks like it could be perfectly solved with a recursive function.

A sequence of Fibonacci numbers, F, is defined with two simple rules:

First two numbers in the sequence are both 1 (F₁ = 1, F₂ = 1),
Every other number in the sequence is the sum of the preceding two (F_n = F_n-1 + F_n-2).

You will also find a different definition of the Fibonacci sequence in the literature, starting with values 0 and 1, but it only differs from our definition in the initial zero. Our definition will produce the sequence 1, 1, 2, 3, 5, 8 ... while the second definition will produce 0, 1, 1, 2, 3, 5, 8.

As I've said, a naive approach to writing a Fibonacci function is to write a recursive function, as Mr. Smith did. This way, however, leads to an exceedingly slow program.

Try it yourself. A program, Fibonacci, from the code archive for this book implements Mr. Smith's functions in the FibonacciRecursive method. Enter a number up to 30 in the Element number edit field, click Recursive and you'll get your answer in a few milliseconds. Increase this value to 40 and you'll have to wait about a second!

The numbers show that if you increase the element number just by one, the calculation time goes up about 50%. That accumulates quickly. My computer needs more than 13 seconds to calculate element number 46, and 96 seconds—a minute and a half!—for element number 50. What is going on?

Calculating Fibonacci numbers the recursive way

The problem with the naive implementation is that it has O(2ⁿ) complexity. When you calculate a value of the function, the first call to FibonacciRecursive causes two new calls. These cause two more calls each (2*2 = 4) and these four cause two more calls each (2*4 = 8) and so on. We have therefore, 2⁰ (= 1) calls initially, 2¹ = 2 on the first level of recursion, 2² = 4 on the second level, 2³ = 8 on the third, and that goes on and on until we need to calculate elements 1 or 2.

A careful examination of these recursive calls shows us something interesting. Let's say we want to calculate the 10^th element in the sequence, F₁₀. That causes recursive calls to calculate F₉ and F₈. F₉ causes recursive calls for F₈ and F₇, while F₈ causes recursive calls for F₇ and F₆. In the next level of recursion we have to calculate F₇ an F₆, F₆ and F₅, F₆ and F₅ again, and F₅ and F₄. We see that we are calculating the same values over and over again, and that definitely doesn't help the execution speed.

A solution to this type of problem is caching (sometimes also called memoization). Simply put, whenever we calculate a result corresponding to some inputs, we put it into a storage—a cache. When we have to calculate a value corresponding to the same inputs, we just pull it from the cache.

This is a very powerful technique that brings in its own collection of problems. We have to decide the size of the cache. Maybe we could cache every already-calculated value, but they will often be too abundant to store. We also have to create and destroy this cache somewhere.

When dealing with the Fibonacci sequence, the size of the cache is not really a problem. If we want the n^th Fibonacci number, we will only need to store values of elements from 1 to n. We can therefore represent the cache by a simple array:

var
  FFibonacciTable: TArray<int64>;

function TfrmFibonacci.FibonacciMemoized(element: int64): int64;
var
  i: Integer;
begin
  SetLength(FFibonacciTable, element+1);
  for i := Low(FFibonacciTable) to High(FFibonacciTable) do
    FFibonacciTable[i] := -1;
  Result := FibonacciRecursiveMemoized(element);
end;

The main function, FibonacciMemoized, firstly creates the cache FFibonacciTable. As dynamic arrays start counting elements with 0, the code allocates one element more than needed, simply to be able to address this array with index [element] instead of [element-1]. One int64 more for better code clarity; not a bad compromise!

Secondly, the code initializes all cache elements to -1. As the elements of the Fibonacci sequence are always a positive number, we can use -1 to represent an uninitialized cache slot.

At the end, the code calls the memoized version of the recursive function.

As dynamic arrays are managed by the compiler, we don't have to destroy the cache after the calculation.

The recursive method, FibonacciRecursiveMemoized, is very similar to Mr. Smith's code, except that it adds the cache management. At the beginning, it will check the cache to see if the value for the current element was already calculated. If it was, the result is simply taken from the cache. Otherwise, the value is calculated recursively and the result is added to the cache:

function TfrmFibonacci.FibonacciRecursiveMemoized(element: int64): int64;
begin
  if FFibonacciTable[element] >= 0 then
    Result := FFibonacciTable[element]
  else
  begin
    if element < 3 then
      Result := 1
    else
      Result := FibonacciRecursiveMemoized(element - 1) +
                FibonacciRecursiveMemoized(element - 2);
    FFibonacciTable[element] := Result;
  end;
end;

In the Fibonacci program, you can use the Recursive + memoization button to test this version. You'll see that it will need very little time to calculate Fibonacci sequence elements up to number 92. On my computer it always needs less than 1 millisecond.

Why 92, actually? Well, because the 93^rd Fibonacci number exceeds the highest int64 value and the result turns negative. Indeed, F₉₂ is larger than 2⁶³, which is a really big number by itself!

So, is this memoized solution a good way of calculating Fibonacci numbers? No, it is not! A much better way is just to start with the first two values (1 and 1), then add them to get F₃ (1 + 1 = 2), add F₂ and F₃ to get F₄ (1 + 2 = 3), and so on. This iterative solution is much faster than the cached recursive solution although our timing doesn't show that. The precision is simply not good enough and both versions will show a running time of 0 milliseconds:

function TfrmFibonacci.FibonacciIterative(element: int64): int64;
var
  a,b: int64;
begin
  a := 1;
  b := 0;
  repeat
    if element = 1 then
      Exit(a);
    b := b + a;
    if element = 2 then
      Exit(b);
    a := a + b;
    Dec(element, 2);
  until false;
end;