Delphi High Performance
上QQ阅读APP看书,第一时间看更新

Speeding up SlowCode

For the last practical example, we can try speeding up Mr. Smith's SlowCode from Chapter 1, About Performance. Here, we immediately ran into a problem. To fix or change an algorithm we must understand what the code does. This happens a lot in practice, especially when you inherit some code. Reading and understanding code that you didn't write is an important skill.

Let's try to understand the first part of SlowCode. The for loop in SlowMethod starts counting with 2. Then it calls ElementInDataDivides, which does nothing as the data list is empty. Next, SlowMethod adds 2 to the list.

Next, i takes the value of 3. ElementInDataDivides checks if 3 is divisible by 2. It is not, so SlowMethod adds 3 to the list.

In the next step, i = 4, it is divisible by 2, and 4 is not added to the list. 5 is then added to the list (it is not divisible by 2 or 3), 6 is not (divisible by 2), 7 is added (not divisible by 2, 3, or 5), 8 is not (divisible by 2), 9 is not (divisible by 3), and so on:

function ElementInDataDivides(data: TList<Integer>; value: Integer): boolean;
var
i: Integer;
begin
Result := True;
for i in data do
if (value <> i) and ((value mod i) = 0) then
Exit;
Result := False;
end;

function SlowMethod(highBound: Integer): TArray<Integer>;
var
i: Integer;
temp: TList<Integer>;
begin
temp := TList<Integer>.Create;
try
for i := 2 to highBound do
if not ElementInDataDivides(temp, i) then
temp.Add(i);
Result := Filter(temp);
finally
FreeAndNil(temp);
end;
end;

We now understand what the first part (before calling the Filter method) does. It calculates prime numbers with a very simple test. It tries to divide each candidate by every already-generated prime. If the candidate number is not divisible by any known prime number, it is also a prime number and we can add it to the list.

Can we improve this algorithm? With a bit of mathematical knowledge, we can do exactly that! As it turns out (the proof is in the next information box and you can safely skip it), we don't have to test divisibility with every generated prime number. If we are testing number value, we only have to test divisibility with a prime number smaller or equal to Sqrt(value) (square root of value).

My assumption is this: If we have number n, which is not divisible by any number smaller than or equal to Sqrt(n), then it is also not divisible by any number larger than Sqrt(n). The simplest way to prove that is by establishing a contradiction.

Let's say that we can divide n by p. We can therefore write n = p * k. As n is not divisible by numbers smaller than or equal to Sqrt(n), both p and k must be strictly larger than Sqrt(n). If that is true, p * k must be strictly larger than Sqrt(n) * Sqrt(n), or n.

On one side we have n = p * k, and on the other p * k > n. These cannot both be true at the same time so we have run into a contradiction. Therefore, the initial assumption is correct. 

We can therefore rewrite ElementInDataDivides so that it will stop the loop when the first prime number larger than Sqrt(value) is encountered. You can find the following code in the SlowCode_v2 program:

function ElementInDataDivides(data: TList<Integer>; value: Integer): boolean;
var
i: Integer;
highBound: integer;
begin
Result := True;
highBound := Trunc(Sqrt(value));
for i in data do begin
if i > highBound then
break;
if (value <> i) and ((value mod i) = 0) then
Exit;
end;
Result := False;
end;

If we compare the original version with time complexity O(n2) with this improved version with time complexity O(n * sqrt n), we can see definite improvement. Still, we can do even better!

Try to search prime number generation on the internet and you'll quickly find a very simple algorithm which was invented way before computers. It is called the sieve of Eratosthenes and was invented by the Greek scholar who lived in the third century BC.

I will not go into details here. For the theory you can check Wikipedia and for a practical example, see the SlowCode_Sieve program in the code archive. Let me just say that the sieve of Eratosthenes has a time complexity of O(n log log n), which is very close to O(n). It should therefore be quite a bit faster than our Sqrt(n) improvement.

The following table compares SlowCode, SlowCode_v2, and SlowCode_Sieve for different inputs. The highest input was not tested with the original program as it would run for more than an hour. All times are in milliseconds: