JavaScript concurrency principles: Parallelize, Synchronize, Conserve
Now that we've been through the basics of what concurrency is, and its role in front-end web development, let's look at some fundamental concurrency principles of JavaScript development. These principles are merely tools that inform our design choices when we write concurrent JavaScript code.
When we apply these principles, they force us to step back and ask the appropriate questions before we move forward with implementation. In particular, they're the why, what, and how questions:
- Why are we implementing this concurrent design?
- What do we hope to get out of it that we couldn't otherwise get out of a simpler synchronous approach?
- How do we implement concurrency in a way that's unobtrusive to the features of our applications?
These three principles feed into one another during the development process. With that, we'll turn our attention to each principle for further exploration.
Parallelize
The parallelize principle means taking advantage of modern CPU capabilities to compute results in less time. This is now possible in any modern browser or Node.js environment. In the browser, we can achieve true parallelism using web workers. In Node, we can achieve true parallelism by spawning new processes. Here's a minimal sketch of what this might look like in the browser; the file names and the sum-of-squares computation are illustrative placeholders:
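```js
// main.js: hand a chunk of data to a worker so the computation
// runs off the main thread.
const worker = new Worker('worker.js');

worker.addEventListener('message', (event) => {
  console.log('result from worker:', event.data);
});

worker.postMessage([1, 2, 3, 4, 5]);
```

And the worker itself:

```js
// worker.js: runs in parallel with the main thread.
self.addEventListener('message', (event) => {
  // A stand-in for an expensive computation: sum of squares.
  const result = event.data.reduce((sum, n) => sum + n * n, 0);
  self.postMessage(result);
});
```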
With the goal being more computations in less time, we must now ask ourselves why we want to do this. Besides the fact that raw performance is super cool in its own right, there has to be some tangible impact for the user. This principle makes us look at our parallel code and ask: what does the user get out of this? The answer is that we can compute using larger data sets as input, and reduce the chance of an unresponsive user experience due to long-running JavaScript.
It's important to scrutinize the tangible benefit of going parallel because when we do so, we add complexity to our code that wouldn't otherwise be there. So if the user sees the same result no matter what we do, the parallelize principle probably isn't applicable. On the other hand, if scalability is important and there's a strong possibility of growing data set sizes, the trade-off of code simplicity for parallelism is probably worthwhile. Here's a checklist to follow when thinking about the parallelize principle:
- Does our application perform expensive computations against large data sets?
- As our data sets grow in size, is there potential for processing bottlenecks that negatively impact the user experience?
- Do our users currently experience bottlenecks in our application's performance?
- How feasible is parallelism in our design, given other constraints? What are the trade-offs?
- Do the benefits of our concurrency implementation outweigh the overhead costs, either in terms of user-perceived latency or in terms of code maintainability?
Synchronize
The synchronize principle is about the mechanisms used to coordinate concurrent actions and the abstractions of those mechanisms. Callback functions are a JavaScript notion with deep roots. They're the obvious tool of choice when we need to run some code, but we don't want to run it now; we want to run it when some condition becomes true. By and large, there's nothing inherently wrong with this approach. Used in isolation, the callback pattern is probably the most succinct, readable concurrency pattern that we can use. Callbacks fall apart when there are plenty of them, and lots of dependencies between them.
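Here's a rough sketch of that failure mode; every function here is a hypothetical stand-in for application code:

```js
// Hypothetical async stubs standing in for real application code.
const getUser = (id, cb) => setTimeout(() => cb(null, { id }), 10);
const getOrders = (userId, cb) => setTimeout(() => cb(null, [{ id: 1 }]), 10);
const getOrderDetails = (orderId, cb) => setTimeout(() => cb(null, { orderId, total: 42 }), 10);
const handleError = (err) => console.error(err);
const render = (details) => console.log('render', details);

// Each step depends on the previous result, so the nesting and the
// error-handling boilerplate grow with every new dependency.
getUser(1, (err, user) => {
  if (err) { return handleError(err); }
  getOrders(user.id, (err, orders) => {
    if (err) { return handleError(err); }
    getOrderDetails(orders[0].id, (err, details) => {
      if (err) { return handleError(err); }
      render(details);
    });
  });
});
```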
The Promise API
The Promise API is a core JavaScript language construct, introduced in ECMAScript 6 to address the synchronization woes faced by every application on the planet. It's a simple API that actually makes use of callbacks (yes, we're fighting callbacks with callbacks). The aim of promises isn't to eliminate callbacks; it's to remove the unnecessary ones. Here's a minimal sketch of a promise-based approach to synchronizing two network fetch calls; the endpoint URLs are placeholders:
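```js
// Each fetch call returns a promise immediately; the requests
// run concurrently, and Promise.all resolves once both complete.
const first = fetch('/api/first').then((resp) => resp.json());
const second = fetch('/api/second').then((resp) => resp.json());

Promise.all([first, second]).then(([firstData, secondData]) => {
  // Both results are available here, regardless of which
  // request finished first.
  console.log(firstData, secondData);
});
```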
What's crucial about promises is that they're a generic synchronization mechanism. This means that they're not specifically made for network requests, web workers, or DOM events. We, the programmers, have to wrap our asynchronous actions with promises and resolve them as necessary. This is a good thing because the callers that rely on the promise interface don't care about what's going on inside the promise. As the name implies, it's a promise to resolve a value at some point. This could be in 5 seconds or immediately. The data can come from a network resource or a web worker. The caller doesn't care, because it makes an assumption of concurrency, which means we can fulfill it in any way we like without breaking the application. For a taste of what this makes possible, here's a sketch that hides a web worker behind the promise interface; the worker.js file is the illustrative worker from the parallelize sketch:
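```js
// The caller consumes this exactly as it would a network request;
// where the value comes from is an implementation detail.
const fromWorker = (worker, input) =>
  new Promise((resolve) => {
    worker.addEventListener('message', (e) => resolve(e.data), { once: true });
    worker.postMessage(input);
  });

fromWorker(new Worker('worker.js'), [1, 2, 3, 4, 5])
  .then((result) => console.log('resolved:', result));
```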
When we learn to treat values as values at some point in the future, concurrent code is suddenly much more approachable. Promises, and similar mechanisms, can be used to synchronize just network requests, or just web worker events. But their real power shows when we use them to write concurrent applications, where concurrency is the default. Here's a checklist to reference when thinking about the synchronize principle:
- Does our application heavily rely on callback functions as a synchronization mechanism?
- Do we often have to synchronize more than one asynchronous event, such as network requests?
- Do our callback functions contain more synchronization boilerplate code than application code?
- What kind of assumptions does our code make about the concurrency mechanisms that drive asynchronous events?
- If we had a magic kill-concurrency button, would our application still behave as expected?
Conserve
The conserve principle is about saving on compute and memory resources, which is done by using lazy evaluation techniques. The name lazy stems from the idea that we don't compute a new value until we're sure we actually need it. Imagine an application component that renders page elements. We can pass this component the exact data that it needs to render. This means that several computations take place before the component actually needs the result. It also means that the data that's used needs to be allocated in memory, so that we can pass it to the component. There's nothing wrong with this approach. In fact, it's the standard way to pass data around in our JavaScript components.
The alternative approach uses lazy evaluation to achieve the same result. Rather than computing the values to be rendered, then allocating them in a structure to be passed, we compute one item, and then render it. Think of this as a kind of cooperative multi-tasking, where the larger action is broken down into smaller tasks that pass the focus of control back and forth.
Here's a sketch of an eager approach to computing data and passing it to the component that renders UI elements; the transform and the component are hypothetical stand-ins:
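```js
// Hypothetical stand-ins for application code.
const expensiveTransform = (item) => item * item;
const renderComponent = (items) => {
  for (const item of items) {
    console.log('rendering', item);
  }
};

const data = [1, 2, 3, 4, 5];

// Every item is transformed, and a whole new array is allocated,
// before the component sees anything at all.
const transformed = data.map(expensiveTransform);
renderComponent(transformed);
```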
There are two undesirable aspects to this approach. First, the transformation happens up-front, which could be a costly computation. What happens if the component is unable to render it for whatever reason, due to some constraint? Then we've performed this computation to transform data that wasn't needed. Second, we've allocated a new data structure for the transformed data so that we could pass it to our component. This transient memory structure doesn't really serve any purpose, as it's garbage-collected immediately. Let's take a look at what the lazy approach might look like; here's a sketch using a generator, with the same hypothetical stand-ins:
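```js
// The same hypothetical transform as in the eager sketch.
const expensiveTransform = (item) => item * item;

// A generator transforms one item at a time, and only when the
// consumer asks for it; no intermediate array is allocated.
function* lazyTransform(items) {
  for (const item of items) {
    yield expensiveTransform(item);
  }
}

const renderComponent = (source) => {
  for (const item of source) {
    console.log('rendering', item);
    // The component could break out of this loop early, and the
    // remaining items would never be transformed at all.
  }
};

renderComponent(lazyTransform([1, 2, 3, 4, 5]));
```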
Using the lazy approach, we're able to remove the expensive transform computation that happens up-front. Instead, we transform only one item at a time. We're also able to remove the up-front allocation of the transformed data structure. Instead, only the transformed item is passed into the component. Then, the component can ask for another item or stop. The conserve principle uses concurrency as a means to only compute what's needed and only allocate memory that's needed.
The following checklist will help us think about the conserve principle when writing concurrent code:
- Are we computing values that are never used?
- Do we only allocate data structures as a means to pass them from one component to the next?
- Do we chain together data transformation actions?