
Data Lake benefits

Organizations generate a huge amount of data across their business systems, and as they grow, they need to get smarter about handling data spread across disparate systems.

One of the most basic approaches is to have a single domain model that accurately describes their data and represents the most significant data for their overall business. Such information may be referred to as enterprise data.

An organization that has well-defined enterprise data also has ways to manage that data, so that changes to its definition stay consistent and it is well understood how systems share this information.

In such a case, the systems may be broadly classified as data owners and data consumers. For enterprise data, there needs to be an owner, and that owner defines how the data becomes available to other consuming systems that play the role of data consumers.
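The owner and consumer roles described above can be illustrated with a minimal sketch. It assumes a small in-memory catalog; the names `EnterpriseDataset`, `DataCatalog`, and the example systems `crm`, `billing`, and `analytics` are hypothetical and are used here only for illustration, not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class EnterpriseDataset:
    """One piece of enterprise data with exactly one owning system (hypothetical model)."""
    name: str
    owner: str
    schema: dict
    consumers: set = field(default_factory=set)


class DataCatalog:
    """Hypothetical in-memory registry of enterprise datasets."""

    def __init__(self):
        self._datasets = {}

    def register(self, name, owner, schema):
        # Each dataset gets a single owner at registration time.
        if name in self._datasets:
            raise ValueError(f"{name!r} already has an owner")
        self._datasets[name] = EnterpriseDataset(name, owner, dict(schema))

    def subscribe(self, name, consumer):
        # A consuming system declares that it reads this dataset.
        self._datasets[name].consumers.add(consumer)

    def update_schema(self, name, requested_by, schema):
        # Only the owner may change the definition of the data.
        dataset = self._datasets[name]
        if requested_by != dataset.owner:
            raise PermissionError(f"{requested_by!r} does not own {name!r}")
        dataset.schema = dict(schema)
        # Return the consumers that need to hear about the change.
        return sorted(dataset.consumers)


catalog = DataCatalog()
catalog.register("customer", owner="crm", schema={"id": "int", "name": "str"})
catalog.subscribe("customer", "billing")
catalog.subscribe("customer", "analytics")

to_notify = catalog.update_schema(
    "customer", "crm", {"id": "int", "name": "str", "email": "str"}
)
print(to_notify)
```

In this sketch only the owning system can change a definition, and every change yields the list of consumers that must be informed, which is one simple way to keep definitions consistent across systems.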

Once organizations have this clear definition of data and systems, they can extract a great deal of value from it. Nowadays, one of the common ways to realize this model of enterprise data is to build an enterprise-wide Data Lake responsible for capturing, processing, analyzing, and serving this data to the consuming systems. Consistent knowledge of this central model can help organizations with the following:

Data governance and lineage
Applying machine learning and artificial intelligence to derive business intelligence
Predictive analysis, such as a domain-specific recommendation engine
Information traceability and consistency
Historical analysis to derive dimensional data
Data services optimized primarily for data delivery, built on a centralized source for all enterprise data
Making more informed decisions for future growth

In this section, we discussed what a Data Lake is capable of. A natural follow-on in this chapter is to discuss and summarize how a Data Lake works and how it can be realized.