Data Lake for Enterprises

Components of a Lambda Architecture

We have already touched on the components of the Lambda Architecture in several sections of this book, and I am sure you will have formed some idea of them by now. This section and the next one describe each component in detail. The discussion deliberately avoids any dependency on specific technologies, because we first need to understand the layers themselves; once we have done that, we can implement the pattern with any technology available in the market without much trouble. Understanding each layer, its significance, and the main function it is responsible for is essential, because the later chapters build on this foundation. In the context of a Data Lake, the components of the Lambda Architecture together form just one of its layers, termed the Lambda Layer. We will now go through the layers within this Lambda Layer in detail. The main layers constituting the Lambda Layer are as follows:

  • Batch layer
  • Speed layer
  • Serving layer

A pictorial view of a Lambda Architecture is shown next. The figure shows these important layers in the pattern:

Figure 02: Components of a Lambda Architecture

As the figure shows, new data is fed to both the batch and speed layers. The batch layer recomputes its views at every configured batch interval, while the speed layer creates the corresponding real-time (speed) views. The serving layer orchestrates each query by querying both the batch and speed layers, merging the results, and sending the merged result back.
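
The flow just described can be sketched in a few lines of technology-neutral Python. This is only a minimal illustration under stated assumptions: the event data, the `BATCH_CUTOFF` value, and the function names `build_batch_view`, `build_speed_view`, and `serve_query` are all hypothetical and do not come from any particular product.

```python
from collections import Counter

# Hypothetical "new data": (timestamp, page) pairs fed to both layers.
events = [
    (1, "home"), (2, "cart"), (3, "home"),   # already covered by the last batch run
    (4, "home"), (5, "checkout"),            # arrived after the last batch run
]

BATCH_CUTOFF = 3  # assumption: the last batch run covered timestamps <= 3


def build_batch_view(all_events, cutoff):
    """Batch layer: recompute the view from the raw data up to the cutoff."""
    return Counter(page for ts, page in all_events if ts <= cutoff)


def build_speed_view(all_events, cutoff):
    """Speed layer: cover only the data the batch view has not absorbed yet."""
    return Counter(page for ts, page in all_events if ts > cutoff)


def serve_query(page, batch_view, speed_view):
    """Serving layer: query both views and merge the results."""
    return batch_view[page] + speed_view[page]


batch_view = build_batch_view(events, BATCH_CUTOFF)
speed_view = build_speed_view(events, BATCH_CUTOFF)
home_views = serve_query("home", batch_view, speed_view)
```

Here the merge is a simple sum because the views are counts; a real serving layer would use whatever merge logic fits the shape of its views.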

Whenever a new batch view is created, the existing speed view is discarded, and only data arriving after that batch is used to generate the next speed view. Old batch views are either archived or discarded, depending on the use case or implementation.
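
The hand-over between the two kinds of view can be illustrated with a small sketch. The `LambdaViews` class and its method names are hypothetical, and the sketch assumes that a batch run completes instantly; a real implementation also has to account for data that arrives while the batch job is still running.

```python
from collections import Counter


class LambdaViews:
    """Hypothetical holder for the current batch view and speed view."""

    def __init__(self):
        self.master = []             # all raw events seen so far: (timestamp, key)
        self.batch_view = Counter()
        self.speed_view = Counter()
        self.archived = []           # old batch views (archiving is an assumption)

    def ingest(self, ts, key):
        # New data is fed to both layers.
        self.master.append((ts, key))
        self.speed_view[key] += 1

    def run_batch(self):
        # Keep the old batch view; a different use case might discard it.
        self.archived.append(self.batch_view)
        # Recompute the batch view from all the raw data.
        self.batch_view = Counter(key for _, key in self.master)
        # Discard the speed view; only data after this batch feeds the next one.
        self.speed_view = Counter()

    def query(self, key):
        # Merge both views, as the serving layer would.
        return self.batch_view[key] + self.speed_view[key]


views = LambdaViews()
views.ingest(1, "home")
views.ingest(2, "home")
before = views.query("home")   # answered by the speed view alone

views.run_batch()              # the batch view now covers both events
views.ingest(3, "home")        # only this event lands in the fresh speed view
after = views.query("home")
```

Query results stay consistent across the batch run because the data dropped from the speed view is exactly the data the new batch view has absorbed.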

In general terms, the modern way of handling big data is to follow a data pipeline, as shown in the next figure: data is taken from the source of truth in its rawest format, appropriate views are created from it to meet a business requirement, and those views are then used as needed. The core of the Lambda Architecture follows the same steps: the batch and speed layers produce the appropriate views, and the serving layer sits in between and performs the necessary orchestration of those views.

Figure 03: Big Data pipeline