Learn MongoDB 4.x

Performing backup and restore operations

Close your eyes, and picture for a moment that you are riding a roller-coaster. Now, open your eyes and have a look at your MongoDB installation! Even something as simple as backup and restore becomes highly problematic when applied to a MongoDB installation that contains replica sets and whose collections are split into shards.

The primary tools for performing backup and restore operations are mongodump and mongorestore.

You can also take snapshots of your data rather than attempt to dump the entire local database. This technique is considered a best practice; however, it is operating system-specific and requires setting up operating system tools that are outside of the purview of this book. For more information, see the Backup and Restore Using Filesystem Snapshots topic in the MongoDB documentation. For documentation on mongodump, see https://docs.mongodb.com/manual/reference/program/mongodump/#mongodump. For documentation on mongorestore, see https://docs.mongodb.com/manual/reference/program/mongorestore/#mongorestore.
Using mongodump to back up a local database

The primary backup tool is mongodump, which is run from the command line (not from the mongo shell!). This command can back up the MongoDB databases on the local server or on any accessible host. The output from this command is in the form of Binary JSON (BSON) documents, a binary-encoded serialization of JSON-like documents. When backing up, there are a number of options available in the form of command-line switches, which affect what is backed up, and how.

If you want to back up every database on the local MongoDB server, just type mongodump. A directory called dump will be created in the current working directory; each database will have its own subdirectory, and each collection its own BSON backup file (accompanied by a small metadata file).
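To check what a dump contains, you can walk the directory layout just described. The following is a minimal sketch, assuming the default layout of one subdirectory per database and one .bson file per collection; the list_dumped function name is purely illustrative and not part of MongoDB.

```shell
#!/bin/sh
# list_dumped: print one "database.collection" line for every .bson file
# found one level below the given dump directory (default: ./dump).
# The function name is hypothetical; it only inspects files on disk.
list_dumped() {
    dump_dir="${1:-dump}"
    for file in "$dump_dir"/*/*.bson; do
        # When nothing matches, the glob stays literal; skip that case.
        [ -e "$file" ] || continue
        db=$(basename "$(dirname "$file")")
        coll=$(basename "$file" .bson)
        echo "$db.$coll"
    done
}

list_dumped dump
```

If the dump directory does not exist yet, the function simply prints nothing.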
For documentation on BSON, see http://bsonspec.org/. For documentation on JSON, see http://json.org/.
mongodump options summary

To invoke a given option, simply add it to the command line following mongodump. Options can be combined, although some will be mutually exclusive. Here is a list of the mongodump command-line options:


Some options are available as single letters. For example, instead of using the --verbose option, you could alternatively add -v.

Type the following command to get more details on these options along with more examples of single-letter alternatives: mongodump --help.
Security options (authentication, SSL, and Transport Layer Security (TLS)) are covered in more detail in Chapter 11, Administering MongoDB Security. For a complete list of mongodump options, see https://docs.mongodb.com/manual/reference/program/mongodump/#options.
Things to consider before you back up

Here are two important considerations you should address before performing a backup:

  • Performance impact: When you perform a backup, mongodump has to read all of the data being backed up, which adds to the load on the server. Inevitably, this will have a negative impact on performance. You might want to consider scheduling the backup for a time when the least amount of database activity is expected.
  • Replica sets: If the server you are backing up is part of a replica set, backing up a secondary can cause problems as the data being backed up might not be current. You can use the --oplog option to include in the backup the operations log (oplog) entries written while the dump is running. Replaying these entries at restore time brings the restored data to a consistent state as of the moment the dump finished.
    It is highly recommended when you back up a replica set member to only back up the primary. This is easily accomplished by adding the following option:
--host=<replica_set_name>/<primary_host_address>:<port>
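Put together, a full backup of a replica set primary that also captures the oplog might look like the following sketch. The replica set name rs0, the host mongo1.example.com:27017, the output directory, and the run helper with its DRY_RUN variable are all placeholders chosen for illustration. Note that --oplog is only supported for full dumps, so no --db or --collection option is given.

```shell
#!/bin/sh
# Placeholder values: substitute your own replica set name and primary address.
REPLICA_SET="rs0"                    # hypothetical replica set name
PRIMARY="mongo1.example.com:27017"   # hypothetical primary host and port
OUT_DIR="${OUT_DIR:-$HOME/backup/full}"

# run: print the command, and execute it only when DRY_RUN=0 is set.
# (Illustrative helper, not part of the MongoDB tools.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Full dump from the primary, including oplog entries written during the dump.
run mongodump --host="$REPLICA_SET/$PRIMARY" --oplog --out="$OUT_DIR"
```

By default the script only prints the command; run it with DRY_RUN=0 to perform the backup.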
mongodump is useful for small MongoDB deployments. When you have a large amount of data being handled through replica sets and sharded clusters, it might be better to use alternative backup solutions such as using filesystem snapshots (described in the next sub-section) or external hardware-based solutions.
Backing up a specific database

Here is an example of the sweetscomplete database being backed up to a ~/backup directory at verbosity level 2:
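A command matching that description might look like the following sketch. It assumes the long-form --db, --out, and --verbose options of mongodump; the run helper and the DRY_RUN variable are illustrative names, not part of MongoDB.

```shell
#!/bin/sh
DB_NAME="sweetscomplete"
BACKUP_DIR="${BACKUP_DIR:-$HOME/backup}"

# run: print the command, and execute it only when DRY_RUN=0 is set.
# (Illustrative helper, not part of the MongoDB tools.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Dump a single database into BACKUP_DIR/<database>/ at verbosity level 2.
run mongodump --db="$DB_NAME" --out="$BACKUP_DIR" --verbose=2
```

With these values, the collection files end up in a sweetscomplete subdirectory of the backup directory.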

In order to fully test backup and restore, first insert all the sample data from inside the Docker container, as follows:
/path/to/repo/restore_data_inside.sh.
Restoring using mongorestore

The primary MongoDB command to restore data is mongorestore. The command-line options are identical to those listed previously for mongodump, with the following additions:

Things to consider before restoring data

Before restoring data, here are a few important considerations:

  • Restoring a replica set: Restoring a replica set is not as simple as running mongorestore on the primary. There is a very specific procedure that must be followed, which is described in detail in Chapter 13, Deploying a Replica Set. The procedure basically involves bringing the replica set down, restoring data on one server, and then redeploying the replica set.
  • Using the oplog: If you performed a backup using the mongodump --oplog option, when restoring you can add the --oplogReplay option. This replays the oplog entries captured during the dump, so that the restored data reflects the precise point in time at which the backup finished.
  • Maintaining document integrity: The --drop option causes the collection to be dropped entirely prior to the restore operation. Bear in mind that the trade-off for added database integrity is that it takes longer to restore. Also consider adding the --objcheck option, which checks the integrity of objects (that is, documents) before database insertion.
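The oplog and integrity options above can be combined in a single restore. The following is a sketch only, assuming a full dump taken with mongodump --oplog into the directory named by DUMP_DIR (default: dump); the run helper and the DRY_RUN variable are illustrative names, not part of MongoDB.

```shell
#!/bin/sh
DUMP_DIR="${DUMP_DIR:-dump}"

# run: print the command, and execute it only when DRY_RUN=0 is set.
# (Illustrative helper, not part of the MongoDB tools.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# --oplogReplay expects oplog.bson at the top level of the dump directory.
if [ ! -f "$DUMP_DIR/oplog.bson" ]; then
    echo "warning: $DUMP_DIR/oplog.bson not found; was the dump taken with --oplog?" >&2
fi

# Drop each collection before restoring it, validate documents, replay the oplog.
run mongorestore --oplogReplay --drop --objcheck --dir="$DUMP_DIR"
```

For a replica set, this command is only one step of the longer procedure mentioned in the first bullet.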
Restoring the purchases collection

In this example, we restore the purchases collection to the sweetscomplete database, adding a verbosity level of 2, while maintaining database integrity, as follows:

mongorestore --db=sweetscomplete --collection=purchases --drop --objcheck \
--verbose=2 --dir=/root/backup/sweetscomplete/purchases.bson

Here is the output from this command:

Note that if you use the --db and --collection options, you must also use --dir to indicate the associated BSON file.
Using the --nsInclude switch to restore

An easier way to restore is to use the --nsInclude switch. This allows you to specify a namespace pattern in the form <database>.<collection>, which can include an asterisk (*) as a wildcard. Thus, if you wanted to restore all collections of a single database from the default dump directory, you could issue this command (the quotes stop the shell from expanding the asterisk itself):

mongorestore --nsInclude='sweetscomplete.*'

Here is an example of the products collection of the sweetscomplete database being restored:
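A restore limited to that one collection might be sketched as follows. It assumes the backup was written to the directory named by BACKUP_DIR (default: ~/backup), so that the file sits at sweetscomplete/products.bson beneath it; the run helper and the DRY_RUN variable are illustrative names, not part of MongoDB.

```shell
#!/bin/sh
BACKUP_DIR="${BACKUP_DIR:-$HOME/backup}"
NAMESPACE="sweetscomplete.products"

# run: print the command, and execute it only when DRY_RUN=0 is set.
# (Illustrative helper, not part of the MongoDB tools.)
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# --dir points at the root of the dump, not at the individual BSON file;
# --nsInclude then selects which namespaces are restored from it.
run mongorestore --nsInclude="$NAMESPACE" --drop --dir="$BACKUP_DIR"
```

Unlike the --db and --collection form, the namespace pattern alone identifies the collection, so there is no need to spell out the path to the BSON file.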

For documentation on namespace patterns, see https://docs.mongodb.com/manual/reference/limits/#namespaces.

In the last section of this chapter, we have a look at performance monitoring.