
Storage access speeds
Computers are so fast that it can be difficult to tell which operations are quick and which are slow. Everything appears instant; in fact, anything that completes in less than a few hundred milliseconds is imperceptible to humans. However, some operations are orders of magnitude faster than others, and performance problems usually only emerge at scale, when millions of operations are performed in parallel.
There are various resources that can be accessed by an application, and a selection of these is listed as follows:
- CPU caches and registers:
    - L1 cache
    - L2 cache
    - L3 cache
- RAM
- Permanent storage:
    - Local Solid State Drive (SSD)
    - Local Hard Disk Drive (HDD)
- Network resources:
    - Local Area Network (LAN)
    - Regional networking
    - Global internetworking
Virtual Machines (VMs) and cloud infrastructure services may simplify deployment, but they can add further performance complications. The local disk mounted on a machine may in fact be a shared network disk that responds much more slowly than a physical disk attached to the same machine. You may also have to contend with other users for resources.
To appreciate the differences in speed between the various forms of storage, consider the following graph. It shows the time taken to retrieve a small amount of data from a selection of storage media:

This graph was made for this book using averages of latency data found online. It has a logarithmic scale, which means that the differences are very large. The top of the graph represents one second, or one billion nanoseconds. Sending a packet across the Atlantic Ocean and back takes roughly 150 milliseconds (ms), or 150 million nanoseconds (ns), and this is mainly limited by the speed of light. This is still far quicker than you can perceive, so it will appear instantaneous. Indeed, it can often take longer to push a pixel to a screen than to get a packet to another continent.
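If you don't have the graph to hand, the approximate figures quoted in this section can be laid out on a rough logarithmic scale with a few lines of Python. This is only a sketch: the values are the round numbers used in the text, not measurements, and the hash bars simply count orders of magnitude in nanoseconds:

```python
import math

# Approximate latency figures quoted in this section, in nanoseconds.
# These are round, illustrative values, not measurements.
latencies_ns = {
    "L1 cache reference":                              1,
    "Other CPU cache reference":                      10,
    "Main memory (RAM) reference":                   100,
    "Send 1 KB over a gigabit LAN":                8_000,
    "Random read from a local SSD":              150_000,
    "Local HDD seek":                         10_000_000,
    "Packet round trip across the Atlantic": 150_000_000,
}

for name, ns in latencies_ns.items():
    bar = "#" * (int(math.log10(ns)) + 1)   # one '#' per order of magnitude
    print(f"{name:40} {ns:>12,} ns  {bar}")
```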
The next largest bar is the time it takes a physical HDD to move its read arm into position to start reading data (10 ms). Mechanical devices are slow.
The next bar down is how long it takes to read a small block of data at a random location on a local SSD, which is about 150 microseconds. SSDs are based on flash memory and are usually connected in the same way as an HDD.
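To get a feel for this on your own hardware, a short sketch such as the following can sample random reads from a large local file. It assumes a Unix-like system (os.pread is not available on Windows) and a hypothetical test file path; the operating system's page cache will flatter the results unless the file is much bigger than RAM, so treat the output as indicative only, whether the drive is an SSD or an HDD:

```python
import os
import random
import time

# Hypothetical test file -- point this at any large file on the drive
# you want to measure (ideally several times bigger than RAM).
PATH = "/tmp/testfile.bin"
BLOCK = 4096        # read 4 KB at a time
SAMPLES = 1000      # number of random reads to average over

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size
total = 0.0
for _ in range(SAMPLES):
    # Pick a random block-aligned offset within the file
    offset = random.randrange(0, size - BLOCK) // BLOCK * BLOCK
    start = time.perf_counter()
    os.pread(fd, BLOCK, offset)
    total += time.perf_counter() - start
os.close(fd)

print(f"average random read: {total / SAMPLES * 1e6:.1f} microseconds")
```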
The next value is the time taken to send a small datagram of 1 KB (1 kilobyte, or 8 kilobits) over a gigabit LAN, which is just under 10 microseconds. This is typically how servers are connected in a data center. Note how the network itself is quick; what really matters is what you are connecting to at the other end. A network lookup of a value held in memory on another machine can be much quicker than accessing a local drive (although, as this is a log graph, you can't simply stack the bars).
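That LAN figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below only counts the time needed to clock the bits onto a gigabit link, ignoring framing overhead, switching, and propagation delay, so it is a lower bound rather than a measurement:

```python
# Time to serialise 1 KB (8 kilobits) onto a gigabit Ethernet link.
payload_bits = 1000 * 8                # 1 KB as 8,000 bits
link_speed = 1_000_000_000             # 1 gigabit per second
seconds = payload_bits / link_speed
print(f"{seconds * 1e6:.0f} microseconds")   # prints 8 -- just under 10 us
```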
This brings us to main memory, or RAM. It is fast (about 100 ns for a lookup), and it is where most of your program will run. However, RAM is not directly connected to the CPU, and it is slower than the on-die caches. It can be large, often large enough to hold your entire working data set, but it is not as big as disks can be, and it is not permanent: it disappears when the power is lost.
The CPU itself contains small caches for the data that is currently being worked on, and these can respond in less than 10 ns. Modern CPUs may have up to three or even four levels of cache, of increasing size and latency. The fastest (responding in less than 1 ns) is the Level 1 (L1) cache, which is also usually the smallest. If you can fit your working data into these few megabytes or kilobytes of cache, then you can process it very quickly.
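The effect of working-set size can be observed directly, although Python and NumPy overhead blur it considerably; a compiled language would show the step changes far more sharply. The sketch below gathers every element of an array in a random order for a few working-set sizes (the sizes themselves are arbitrary choices): once the array no longer fits in the CPU caches, the cost per element rises:

```python
import time
import numpy as np

# Gather elements in a random order from working sets of increasing size.
# Small arrays stay in the CPU caches; large ones spill out to main memory.
for size_kb in (256, 4096, 65536, 262144):
    n = size_kb * 1024 // 8                    # number of 8-byte elements
    data = np.arange(n, dtype=np.int64)
    idx = np.random.permutation(n)             # random access pattern
    start = time.perf_counter()
    data[idx].sum()                            # touch every element once
    elapsed = time.perf_counter() - start
    print(f"{size_kb:7,} KB working set: {elapsed / n * 1e9:6.2f} ns per element")
```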