
Types of performance problems
There are many types of performance problems and most of them are independent of the programming language that is used. A lot of these result from how the code runs on the computer, and we will cover the impact of this later on in the chapter.
We will briefly introduce common performance problems here and will cover them in more detail in later chapters of this book. Issues that you may encounter will usually fall into a few simple categories, including the following:
- Latency:
- Memory latency
- Network latency
- Disk and I/O latency
- Chattiness/handshakes
- Bandwidth:
- Excessive payloads
- Unoptimized data
- Compression
- Computation:
- Working on too much data
- Calculating unnecessary results
- Brute forcing algorithms
- Responsiveness:
- Synchronous operations that could be done offline
- Caching and coping with stale data
When writing software for a platform, you are usually constrained by two resources. These are the computation processing speed and accessing remote (to the processor) resources. Processing speed is rarely a limiting factor these days, and this can be traded for other resources, for example, compressing some data to reduce the network transfer time. Accessing remote resources, such as the main memory, disk, and network, will have various time costs. It is important to understand that speed is not a single value and it has multiple parameters. The most important of these parameters are bandwidth and, crucially, latency.
Latency is the lag in time before the operation starts, whereas bandwidth is the rate at which data is transferred once the operation starts. Posting a hard drive has a very high bandwidth, but it also has very high latency. This would make it very slow to send lots of text files back and forth, but perhaps, it is a good choice to send a large batch of 3D videos (depending on the Weissman score). A mobile phone data connection may be better for text files. Although this is a contrived example, the same concerns are often applicable to every layer of the computing stack with similar orders of magnitude in time difference. The problem is that the differences are too quick to perceive, and we need to use tools and science to see them.
The secret to solving performance problems is gaining a deeper understanding of the technology and knowing what happens at lower levels. You should appreciate what the framework is doing with your instructions at the network level. It's also important to have a basic grasp of how these commands run on the underlying hardware and how they are affected by the infrastructure that they are deployed to.