Hands-On Game Development with WebAssembly

上QQ阅读APP看书，第一时间看更新

Why is WebAssembly faster than JavaScript?

As I have mentioned, WebAssembly is 10–800% faster than JavaScript, depending on the application. To understand why, I need to talk a little about what a JavaScript engine does when it runs JavaScript code versus what it has to do when it runs WebAssembly. I am going to talk specifically about V8 (the Chrome JavaScript engine), although, to my knowledge, the same general process exists within SpiderMonkey (Firefox) and the Chakra (IE & Edge) JavaScript engines.

The first thing the JavaScript engine does is parse your source code into an Abstract Syntax Tree (AST). The source is broken into branches and leaves based on the logic within your application. At this point, an interpreter starts processing the language that you are currently executing. For many years, JavaScript was just an interpreted language, so, if you ran the same code in your JavaScript 100 times, the JavaScript engine had to take that code and convert it to machine code 100 times. As you can imagine, this is wildly inefficient.

The Chrome browser introduced the first JavaScript JIT compiler in 2008. A JIT compiler contrasts with an Ahead-of-Time (AOT) compiler in that it compiles your code as it is running that code. A profiler sits and watches the JavaScript execution looking for code that repeatedly executes. Whenever it sees code executed a few times, it marks that code as "warm" for JIT compilation. The compiler then compiles a bytecode representation of that JavaScript "stub" code. This bytecode is typically an Intermediate Representation (IR), one step removed from the machine-specific assembly language. Decoding the stub will be significantly faster than running the same lines of code through our interpreter the next time.

Here are the steps needed to run JavaScript code:

Figure 1.1: Steps required by a modern JavaScript engine

While all of this is going on, there is an optimizing compiler that is watching the profiler for "hot" code branches. The optimizing compiler then takes these code branches and optimizes the bytecode that was created by the JIT into highly optimized machine code. At this point, the JavaScript engine has created some super fast running code, but there is a catch (or maybe a few).

The JavaScript engine must make some assumptions about the data types to have an optimized machine code. The problem is, JavaScript is a dynamically typed language. Dynamic typing makes it easier for a programmer to learn how to program JavaScript, but it is a terrible choice for code optimizers. The example I often see is what happens when JavaScript sees the expression c = a + b (although we could use this example for almost any expression).

Just about any machine code that performs this operation does it in three steps:

Load the a value into a register.
Add the b value into a register.
Then store the register into c.

The following pseudo code was taken from section 12.8.3 of the ECMAScript® 2018 Language Specification and describes the code that must run whenever the addition operator (+) is used within JavaScript:

1. Let lref be the result of evaluating AdditiveExpression.
2. Let lval be ? GetValue(lref).
3. Let rref be the result of evaluating MultiplicativeExpression.
4. Let rval be ? GetValue(rref).
5. Let lprim be ? ToPrimitive(lval).
6. Let rprim be ? ToPrimitive(rval).
7. If Type(lprim) is String or Type(rprim) is String, then
   a. Let lstr be ? ToString(lprim).
   b. Let rstr be ? ToString(rprim).
   c. Return the string-concatenation of lstr and rstr.
8. Let lnum be ? ToNumber(lprim).
9. Let rnum be ? ToNumber(rprim).
10.Return the result of applying the addition operation to lnum and      
   rnum.

You can find the ECMAScript® 2018 Language Specification on the web at https://www.ecma-international.org/ecma-262/9.0/index.html.

This pseudo code is not the entirety of what we must evaluate. Several of these steps are calling high-level functions, not running machine code commands. GetValue for example, has 11 steps of its own that are, in turn, calling other steps. All of this could end up resulting in hundreds of machine opcodes. The vast majority of what is happening here is type checking. In JavaScript, when you execute a + b, each one of those variables could be any one of the following types:

Integer
Float
String
Object
Any combination of these

To make matters worse, objects in JavaScript are also highly dynamic. For example, maybe you have defined a function called Point and created two objects with that function using the new operator:

function Point( x, y ) {
    this.x = x;
    this.y = y;
}

var p1 = new Point(1, 100);
var p2 = new Point( 10, 20 );

Now we have two points that share the same class. Say we added this line:

p2.z = 50;

This would mean that these two points would then no longer share the same class. Effectively, p2 has become a brand new class, and this has consequences for where that object exists in memory and available optimizations. JavaScript was designed to be a highly flexible language, but this fact creates a lot of corner cases, and corner cases make optimization difficult.

Another problem with optimization created by the dynamic nature of JavaScript is that no optimization is definitive. All optimizations around typing have to use resources continually checking to see whether their typing assumptions are still valid. Also, the optimizer has to keep the non-optimized code just in case those assumptions turn out to be false. The optimizer may determine that assumptions made initially turn out not to have been correct assumptions. That results in a "bailout" where the optimizer will throw away its optimized code and deoptimize, causing performance inconsistencies.

Finally, JavaScript is a language with Garbage Collection (GC), which allows the authors of the JavaScript code to take on less of the burden of memory management while writing their code. Although this is a convenience for the developer, it just pushes the work of memory management on to the machine at run time. GC has become much more efficient in JavaScript over the years, but it is still work that the JavaScript engine must do when running JavaScript that it does not need to do when running WebAssembly.

Executing a WebAssembly module removes many of the steps required to run JavaScript code. WebAssembly eliminates parsing because the AOT compiler completes that function. An interpreter is unnecessary. Our JIT compiler is doing a near one-to-one translation from bytecode to machine code, which is extremely fast. JavaScript requires the majority of its optimizations because of dynamic typing that does not exist in WebAssembly. Hardware agnostic optimizations can be done in the AOT compiler before the WebAssembly compiles. The JIT optimizer need only perform hardware-specific optimizations that the WebAssembly AOT compiler cannot.

Here are the steps performed by the JavaScript engine to run a WebAssembly binary:

Figure 1.2: The steps required to execute WebAssembly

The last thing that I would like to mention is not a feature of the current MVP, but a potential future enabled by WebAssembly. All the code that makes modern JavaScript fast takes up memory. Keeping old copies of the nonoptimized code for bailout takes up memory. Parsers, interpreters, and garbage collectors all take up memory. On my desktop, Chrome frequently takes up about 1 GB of memory. By running a few tests on my website using https://www.classicsolitaire.com, I can see that with the JavaScript engine turned on, the Chrome browser takes up about 654 MB of memory.

Here is a Task Manager screenshot:

Figure 1.3: Chrome Task Manager process screenshot with JavaScript

With JavaScript turned off, the Chrome browser takes up about 295MB.

Here is a Task Manager screenshot:

Figure 1.4: Chrome Task Manager process screenshot without JavaScript

Because this is one of my websites, I know there are only a few hundred kilobytes of JavaScript code on that website. It's a little shocking to me that running that tiny amount of JavaScript code can increase my browser footprint by about 350 MB. Currently, WebAssembly runs on top of the existing JavaScript engines and still requires quite a bit of JavaScript glue code to make everything work, but in the long run, WebAssembly will not only allow us to speed up execution on the web but will also let us do it with a much smaller memory footprint.