Installing MongoDB on a customer site
In this section, you are given an overview of what is needed to install MongoDB on a server running Linux, Windows, or macOS, located at a customer site. Later, in another section, you are shown how to create a virtual test environment in which you can practice the techniques discussed in this book. Before you can start on the installation, however, it is important to review which version of MongoDB is desired, as well as host computer hardware requirements.
Available versions of MongoDB 4.x
There are a number of versions of MongoDB 4.x available, including the following:
- MongoDB Atlas and Stitch for the cloud environment
- MongoDB Enterprise Edition
- MongoDB Community Edition
The last product listed is free and tends to have a faster release cycle. Interestingly, MongoDB asserts that the Community Edition, unlike the model used by other open source companies, uses the same software pool for its beta versions as does the Enterprise Edition. The Enterprise Edition, when purchased, also includes support as well as advanced security features, such as Kerberos and Lightweight Directory Access Protocol (LDAP) authentication.
Let's now look at the RAM, CPU, and storage requirements.
Understanding RAM and CPU requirements
The minimum CPU requirements for MongoDB are quite light. The MongoDB team recommends that each mongod instance be given access to two cores in a cloud environment, or a multi-core CPU when running on a physical server. The WiredTiger engine is multithreaded and is able to take advantage of multiple cores. Generally speaking, a larger number of available CPU cores improves throughput and performance. As the number of concurrent users increases, however, a drastic downturn in throughput and performance is observed once operations occupy all available cores.
As of MongoDB 3.4, the WiredTiger internal cache defaults to either 50% of (RAM minus 1 GB), or 256 megabytes (MB), whichever is larger.
Accordingly, you need to make sure that you have enough RAM available on your server to handle other OS demands. So, following this formula, if your server has 64 GB of RAM, MongoDB will grab 31.5 GB! You can adjust this parameter, of course, by setting the cacheSizeGB parameter in the MongoDB configuration file. Values for <number>, as illustrated in the following code snippet, represent the size in GB, and can range from 0.25 to 10,000:
storage.wiredTiger.engineConfig.cacheSizeGB : <number>
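The default cache size is easy to estimate before you decide whether to override it. The following Python snippet is a minimal sketch, assuming the rule is 50% of (RAM minus 1 GB) with a floor of 256 MB; the function name default_wiredtiger_cache_gb is hypothetical and is not part of any MongoDB tooling:

```python
def default_wiredtiger_cache_gb(total_ram_gb: float) -> float:
    """Estimate the default WiredTiger internal cache size, in GB.

    Assumed rule: the larger of 50% of (RAM - 1 GB) and 0.25 GB (256 MB).
    """
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


if __name__ == "__main__":
    # Print the estimate for a few illustrative server sizes.
    for ram_gb in (1, 4, 64):
        cache_gb = default_wiredtiger_cache_gb(ram_gb)
        print(f"{ram_gb} GB RAM -> {cache_gb} GB default cache")
```

For a server with 64 GB of RAM, this gives 31.5 GB, which matches the figure quoted previously. If that leaves too little memory for other OS demands, a smaller value can be set through cacheSizeGB.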
Another consideration is what type of RAM is being used. In many cases, especially in corporate environments where server purchasing and server maintenance are done by different groups, memory might have been swapped around to the point where your server suffers from non-uniform memory access (NUMA) problems (https://queue.acm.org/detail.cfm?id=2513149).
Examining storage requirements
The amount of disk space used by MongoDB depends on how much data you plan to house, so it's difficult to give you an exact figure. The type of drive, however, will have a direct impact on read and write performance. For example, if your data is able to fit into a relatively small amount of drive space, you might consider using a Solid State Drive (SSD) rather than a standard hard disk drive due to radical differences in speed and performance. On the other hand, for massive amounts of data, an SSD might not prove economical.
Here are some other storage recommendations for MongoDB:
- Swap space: Use the tools available for your OS to create a swap partition or swap file. Many DevOps professionals no longer follow the traditional formula that states the swap space should be equal to twice the size of RAM. Although increasing the size of swap on your server will not have a direct impact on MongoDB performance, following the traditional formula helps ensure that all other OS services perform well, which, in turn, has a positive effect on MongoDB. Thus, if your server has 64 GB of RAM, the swap space would be 128 GB.
- RAID: The official MongoDB recommendation for Redundant Array of Inexpensive Drives, or Redundant Array of Independent Disks (RAID), is RAID 10 (also called RAID 1+0) at a minimum. This is a nested RAID level that consists of a RAID 0 array of RAID 1 arrays of disks. RAID levels 0 to 6 by themselves are considered inadequate. RAID levels 50, 60, or 100 would also be considered adequate.
- Remote filesystems: Although you can host your MongoDB database files on a remote filesystem, it is not a recommended arrangement. Remote filesystems can take the form of Network File System (NFS) mount points in *nix (that is, Unix, Linux, or macOS) or Windows shares (Windows networks).
A critical aspect of storage requirements has to do with how many Input/Output Operations Per Second (IOPS) the disk supports. If your use case requires 1,000 operations per second and your disk supports 500 IOPS, you will have to shard to meet your performance requirements. Additionally, you should consider the compression feature of the WiredTiger storage engine. As an example, if you have 1 terabyte (TB) of data, MongoDB will typically achieve a 30% compression factor, reducing your physical disk requirements.
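The IOPS and compression figures above lend themselves to a quick back-of-the-envelope calculation. The following Python snippet is only a rough sketch: it assumes that one database operation costs one disk I/O, that load spreads evenly across shards, and that a 30% compression factor means 30% of the space is saved. The function names shards_needed and compressed_size_tb are hypothetical:

```python
import math


def shards_needed(required_ops_per_sec: int, disk_iops: int) -> int:
    """Smallest shard count whose combined disk IOPS covers the workload.

    Assumes one operation costs one I/O and load is spread evenly.
    """
    if required_ops_per_sec <= 0 or disk_iops <= 0:
        raise ValueError("both values must be positive")
    return math.ceil(required_ops_per_sec / disk_iops)


def compressed_size_tb(raw_tb: float, space_saved: float) -> float:
    """On-disk size after compression; space_saved is a fraction (0.30 = 30%)."""
    if not 0.0 <= space_saved < 1.0:
        raise ValueError("space_saved must be in the range [0, 1)")
    return raw_tb * (1.0 - space_saved)


if __name__ == "__main__":
    print("shards:", shards_needed(1000, 500))
    print("disk needed (TB):", round(compressed_size_tb(1.0, 0.30), 2))
```

Under these assumptions, the example workload of 1,000 operations per second on 500 IOPS disks needs at least two shards, and 1 TB of data occupies roughly 0.7 TB on disk. Real workloads rarely map one operation to one I/O, so treat the result as a starting point for capacity planning rather than a guarantee.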
Choosing the filesystem type
Each OS offers a choice of filesystem types. Some filesystem types are older and less performant, and are not good candidates to run MongoDB. There are two main reasons why an OS continues to offer them. The first is the need to support legacy applications and legacy files. The other reason is that OS vendors want to give their customers more choice.
The recommended filesystem types are summarized in this table:
Let's now look at other server considerations.
Other server considerations
In addition to the hardware considerations listed previously, here are some additional points to consider:
- Verifying needed OS libraries: MongoDB relies upon various OS libraries. One very common example is the need for an up-to-date version of the OpenSSL library if you plan to configure MongoDB to use Secure Sockets Layer/Transport Layer Security (SSL/TLS) in its communications.
- Clock synchronization: As any DevOps engineer is aware, internal computer clocks are notorious for their drift. Accordingly, many schemes have been used over the years to keep computer clocks in sync. The most prevalent technology used today is the Network Time Protocol (NTP). Servers run an internal daemon (or service) that makes occasional checks with one or more primary NTP servers. An internal drift algorithm is used to make micro-adjustments to the server clocks such that the need to check with an NTP server declines over time, as the NTP daemon running locally learns to adjust more and more accurately.
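As a rough first check for the OS library point above, Python's standard ssl module can report an OpenSSL version. Note the assumptions in this sketch: the version shown is the one Python itself was linked against, which may differ from the library that mongod uses on the same host, and the minimum version passed in is a hypothetical threshold chosen for illustration, not an official MongoDB requirement:

```python
import ssl


def openssl_at_least(minimum: tuple) -> bool:
    """Return True if the OpenSSL linked into this Python is >= minimum.

    minimum is a (major, minor, fix) tuple, for example (1, 1, 1).
    """
    return ssl.OPENSSL_VERSION_INFO[:3] >= minimum


if __name__ == "__main__":
    # Human-readable version string of the linked library.
    print(ssl.OPENSSL_VERSION)
    # Hypothetical threshold, used here only as an example.
    print("meets 1.1.1 or later:", openssl_at_least((1, 1, 1)))
```

For an authoritative answer, check the OpenSSL package installed by your OS package manager against the requirements of the MongoDB version you plan to install.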
The next section in this chapter expands upon this discussion, taking into consideration the differences between installing MongoDB 3 and installing MongoDB 4.x.