上QQ阅读APP看书，第一时间看更新

Database per service

In a distributed microservices architecture, different services have needs and usages of different storage requirements. The relational database is a perfect choice when it comes to maintaining relations and having complex queries. NoSQL databases such as MongoDB is the best choice when there is unstructured complex data. Some may require graph data and thus use Neo4j or GraphQL. The solution is to keep each of the microservices data private to that service and get it accessible only via APIs. Each microservice maintains its datastore and is a private part of that service implementation and hence it is not directly accessible by other services.

Some of the options you have while implementing this mode of data management are as follows:

Private tables/collections per service: Each microservice has a set of defined tables or collections that can only be accessed by that service
Schema per service: Each service has a schema that can only be accessed via the microservice it is bound to
Database per service: Each microservice maintains its own database as per its needs and requirements

When thought of, maintaining a schema per service seems to be the most logical solution as it will have lower overhead and ownership can clearly be made visible. If some services have high usage and throughput and different usage, then maintaining a separate database is the logical option. A necessary step is to add barriers that will restrict any microservice from accessing data directly. Various options to add this barrier include assigning user IDs with restricted privileges or accessing control mechanisms such as grants. This pattern has the following advantages:

Loosely coupled services that can stand on their own; changes to one service's datastore won't affect any other services.
Each service has the liberty to select the datastore as required. Each microservice has the option of whether to go for relational or non-relational databases as per need. For example, any service that needs intensive search results on text may go for Solr or Elasticsearch, whereas any service where there is structured data may go for any SQL database.

This pattern has the following drawbacks and upcomings that need to be handled with care:

Handling complex scenarios that involve transactions spanning across multiple services. The CAP theorem states that it is impossible to have more than two out of the following three guarantees—consistency, availability, and partitions in the distributed datastore, so transactions are generally avoided.
Queries ranging across multiple databases are challenging and resource consuming.
The complexity of managing multiple SQL and non-SQL datastores.

To overcome the drawbacks, the following patterns are used while maintaining a database per service:

Sagas: A saga is defined as a batch sequence of local transactions. Each entry in the batch updates the specified database and moves on by publishing a message or triggering an event for the next entry in the batch to happen. If any entry in the batch fails locally or any business rule is violated, then the saga executes a series of compensating transactions that compensate or undo the changes that were made by the saga batch updates.
API Composition: This pattern insists that the application should perform the join rather than the database. As an example, a service is dedicated to query composition. So, if we want to fetch monthly product distributions, then we first retrieve the products from the product service and then query the distribution service to return the distribution information of the retrieved products.
Command Query Responsibility Segregation (CQRS): The principle of this pattern is to have one or more evolving views, which usually have data coming from various services. Fundamentally, it splits the application into two parts—the command or the operating side and the query or the executor side. It is more of a publisher-subscriber pattern where the command side operates create/update/delete requests and emits events whenever the data changes. The executor side listens for those events and handles those queries by maintaining views that are kept up to date, based on the subscription of events that are emitted by the command or operating side.