Foundations of Blockchain
上QQ阅读APP看书,第一时间看更新

Merkle hash trees

A Merkle tree is a binary tree where all the leaf nodes represent hashes of the data blocks. Each parent node has the hashed value of the hashes of its children. Hashing continues until the root node of the tree. Merkle trees are used to summarize bulk sets of data and create a fingerprint for each set.

A tree is a data structure in computer science that consists of a root node and a subtree of parent and children nodes and is represented by positioning a root node at the top. A binary tree is a tree where each parent has at most two nodes. Merkle trees are used in Bitcoin, Ethereum, and other blockchain applications to summarize all the transactions included in each block. SHA-256 is used as a hash function in bitcoin's Merkle tree, as can be seen in the following diagram:

Figure 2.12: Merkle tree that summarizes all the leaves

A Merkle tree is constructed from the bottom up from the leaf nodes. In Figure 2.12, leaf nodes will consist only of hashed values of data blocks A, B, C, and D represented by HA, HB, HC, and HD respectively. Each parent node will construct its hash by concatenating the hash values of the child nodes and hashing them again:

HAB = Hash (HA + HB)

This process is continued until the root node hash value HABCD is calculated.

Since each Merkle tree node (other than leaf nodes) calculates its hash based on its child nodes, it has to maintain a balanced tree, that is, each node (other than leaf nodes) should have two child nodes. This could be achieved by duplicating existing single child nodes.

Merkle trees not only provide a way of summarizing an entire data block, but they can also efficiently verify whether a data block exists. Verification could be achieved in just log2(n) complexity.