Apache Zookeeper

Zookeeper is an open source Apache project that provides a centralized infrastructure and services for synchronization across a cluster. It maintains shared objects such as configuration information and hierarchical namespaces, and supports messaging and notifications. The framework is designed mainly for synchronization within a distributed, clustered environment. Imagine that you need to build an application that runs on 10 to 20 servers and requires synchronization services for data maintenance. Even for such a small setup, you would have to write your own coordination services and deal with every issue that synchronization raises. In this situation, Zookeeper is a great help. It is also designed to be fast and to work well under heavy load in a cluster environment.

Zookeeper is written in Java. It exposes its API to applications in Java as well as in C.

Now we will go through the architecture of Zookeeper. Zookeeper uses a leader and followers model, in which data consistency relies on a leader that orders and verifies updates. The following figure illustrates the Zookeeper Quorum configuration:

It is recommended that a Zookeeper Quorum has an odd number of nodes, as some features in Zookeeper require a voting mechanism. For example, if you have five nodes in the Quorum, three votes are needed as a majority to elect a leader. If you have six nodes, you need four votes to win a majority, the same number needed with seven nodes. Six nodes therefore tolerate no more failures than five (two in both cases), whereas seven tolerate three, so the one extra node makes your life easier without much overhead.
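The majority arithmetic can be checked with a few lines of Python. This is only a sketch: `majority` and `tolerated_failures` are hypothetical helper names introduced for illustration, not part of Zookeeper.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of votes that is more than half of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can fail while a majority can still be formed."""
    return ensemble_size - majority(ensemble_size)


for n in (5, 6, 7):
    print(n, majority(n), tolerated_failures(n))
# 5 3 2
# 6 4 2
# 7 4 3
```

Five and six nodes both survive two failures, so the sixth node adds a voter without adding fault tolerance. This is the reason odd sizes are preferred.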

Whenever a leader is unable to reach a majority because of hardware failure, network failure, or any other reason, it steps down and the Zookeeper system stops accepting writes; at best it remains available for reads, and only if read-only mode has been enabled. Similarly, when the leader itself fails, the remaining nodes elect a new leader.

Zookeeper is a fault-tolerant system that makes sure replication is done properly. For a write operation, the request first goes to the leader, which forwards it to the available nodes. Once a majority of nodes have received the update and written it persistently, Zookeeper commits the change and confirms to the client that the data has been successfully written. The update is sent to the nodes in parallel rather than one after another.
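The write path described here can be modeled in a few lines. The following is a toy, in-memory sketch and not Zookeeper's actual protocol or API: the `Leader` and `Follower` classes and the `reachable` flag are assumptions made for illustration, and the loop contacts followers one by one where a real leader would do so in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Toy replica; `reachable` is a hypothetical flag standing in for failures."""
    reachable: bool = True
    log: list = field(default_factory=list)

    def propose(self, update) -> bool:
        # A reachable follower persists the update and acknowledges it.
        if not self.reachable:
            return False
        self.log.append(update)
        return True


@dataclass
class Leader:
    followers: list
    log: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def write(self, update) -> bool:
        ensemble_size = len(self.followers) + 1   # followers plus the leader
        needed = ensemble_size // 2 + 1           # strict majority
        self.log.append(update)                   # the leader logs the update first
        acks = 1 + sum(f.propose(update) for f in self.followers)
        if acks >= needed:
            self.committed.append(update)         # commit, then confirm to the client
            return True
        return False                              # no majority: write is not confirmed


# Five-node ensemble with two unreachable followers: 3 of 5 acknowledge.
leader = Leader([Follower(), Follower(), Follower(False), Follower(False)])
print(leader.write("config=v2"))   # True
```

With a third follower unreachable, only two of five nodes would acknowledge and `write` would return `False` without committing.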

The Zookeeper API provides two variants of each data operation: one for synchronous calls and one for asynchronous calls. You can choose either of them based on your requirements.
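The difference between the two calling styles can be illustrated without a Zookeeper server. The sketch below is a stand-in only: `get_data`, `get_data_async`, the in-memory `_store`, and the single-thread pool are all hypothetical names for this example and do not reproduce the real client signatures.

```python
from concurrent.futures import ThreadPoolExecutor

_store = {"/app/config": b"v1"}             # stand-in for server-side data
_pool = ThreadPoolExecutor(max_workers=1)   # stand-in for the client's I/O thread


def get_data(path):
    """Synchronous style: block and return the result to the caller."""
    return _store[path]


def get_data_async(path, callback):
    """Asynchronous style: return at once and hand the result to a callback."""
    return _pool.submit(lambda: callback(path, _store[path]))


print(get_data("/app/config"))   # b'v1'

results = []
pending = get_data_async("/app/config", lambda path, data: results.append((path, data)))
pending.result()                 # only so that this demo waits before printing
print(results)                   # [('/app/config', b'v1')]
_pool.shutdown(wait=True)
```

The synchronous form is simpler to reason about, while the asynchronous form lets the caller issue many requests without waiting for each reply.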

Zookeeper organizes its data in a tree-like directory structure. Each node in the tree is called a znode. The top-most znode in the hierarchy is the root node, represented by /. Most applications that use Zookeeper create their own node directly under the root. The most important thing to note here is that each znode can hold data as well as child znodes. The following figure illustrates the directory structure of Zookeeper and how it stores information:
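To make the idea of a node that holds both data and children concrete, here is a minimal in-memory sketch. The `ZNode` and `ZTree` classes and the example paths are hypothetical and only mimic the shape of the namespace; they are not the Zookeeper client API.

```python
class ZNode:
    """Toy znode: holds a data payload and named children."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}


class ZTree:
    def __init__(self):
        self.root = ZNode()                     # the "/" node

    def _walk(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            if part:                            # "/" itself has no parts
                node = node.children[part]      # KeyError if a parent is missing
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rstrip("/").rpartition("/")
        if not name:
            raise ValueError("cannot create the root node")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise ValueError(f"{path} already exists")
        parent.children[name] = ZNode(data)

    def get(self, path):
        return self._walk(path).data

    def ls(self, path):
        return sorted(self._walk(path).children)


tree = ZTree()
tree.create("/app", b"owner=team-a")    # an application node under the root
tree.create("/app/config", b"v1")
tree.create("/app/workers")
print(tree.get("/app"))                 # b'owner=team-a'
print(tree.ls("/app"))                  # ['config', 'workers']
```

Note that `/app` carries its own data and also has two children, which is what distinguishes a znode from a plain file or a plain directory.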

There are three types of znode; they are as follows:

Persistent Node: A node that, once created, remains stored until it is deliberately deleted by the application or user.
Ephemeral Node: A node that, once created, exists only as long as the client that created it stays connected, and is deleted when the connection is closed.
Sequential Node: A node to whose name Zookeeper appends a 10-digit sequence number, after the prefix supplied in the create command. It can be persistent or ephemeral in nature.
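The three behaviors can be sketched with a flat toy model. Everything here is an assumption for illustration: `ToyServer`, its `create` flags, the session names, and the per-parent counter are hypothetical and simplified, and this is not the real Zookeeper API.

```python
class ToyServer:
    """Flat toy model of znode creation flags; not the real Zookeeper API."""

    def __init__(self):
        self.nodes = {}       # path -> owning session id, or None if persistent
        self.counters = {}    # parent path -> next sequence number (assumed model)

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if sequential:
            parent = path.rpartition("/")[0] or "/"
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"          # 10-digit, zero-padded suffix
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        # Ephemeral nodes vanish with the session that created them;
        # persistent nodes (owner None) are always kept.
        self.nodes = {p: s for p, s in self.nodes.items()
                      if s is None or s != session}


server = ToyServer()
server.create("/app")                                            # persistent
first = server.create("/app/lock-", "s1", ephemeral=True, sequential=True)
second = server.create("/app/lock-", "s2", ephemeral=True, sequential=True)
print(first)                   # /app/lock-0000000000
print(second)                  # /app/lock-0000000001
server.close_session("s1")
print(sorted(server.nodes))    # ['/app', '/app/lock-0000000001']
```

Combining the ephemeral and sequential flags, as in this example, is a common basis for locks and leader election, because the ordering comes from the suffix and stale entries disappear when their owner disconnects.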

To sum up, Zookeeper provides highly reliable services by keeping multiple copies of its data and configuration. It is best used only to store configuration and to manage metadata about your data, not the data itself. In Hadoop v1, the NameNode was a single point of failure. Because of this, many companies were afraid to use Hadoop: if the NameNode went down, none of their data could be accessed. To overcome this, Hadoop v2 uses Zookeeper to handle NameNode failure. If the active NameNode goes down, the standby NameNode takes its place, and because Zookeeper is fast, the failover completes quickly, with little or no interruption to service.