Overview of the Microsoft SQL Server 2019 technology
Microsoft SQL Server offers a powerful end-to-end data processing platform. In other words, we can gain data from an extensive set of sources, securely and reliably managed, transformed, processed, analyzed, and visualized under an all-in-one license.
The following diagram shows what the bigger picture of the SQL Server technology set looks like:
The preceding diagram shows one of the many possible ways in which technologies within SQL Server can cooperate. SQL Server Database Engine (DE) is depicted twice in the diagram because it possibly plays two major roles within the data processing platform, as follows:
- Line-of-Business (LOB) application contention: In the diagram, at the left occurrence of SQL Server DE, SQL Server provides data ingestion from client applications or other external sources of data.
- Business Intelligence (BI) contention: In the diagram, at the right occurrence of SQL Server DE, the SQL Server instance manages data warehouses, that is, databases used for Analysis Services data model processing or for reporting purposes.
As seen in the preceding diagram, SQL Server contains many technologies used and maintained by just one person. So, the following list shows how specific roles use such technologies:
- Database Administrators (DBAs): DBAs work with SQL Server and SQL Server Agent services, ensuring the continuity of operations, security, disaster recovery (DR) strategies, and similar tasks.
- SQL developers: SQL developers are responsible for the content of databases, from database design and transaction handling to the quality and accuracy of data stored in databases.
- Extract, Transform, Load (ETL) developers: ETL developers' playground lies mainly in SQL Server Integration Services (SSIS) services. ETL developers create a whole ETL workflow and ensure the quality and integrity of data extracted from sources and uploaded to targets of the ETL flow.
- BI developers: BI developers work mainly with reports on SQL Server Reporting Services (SSRS) and with multidimensional data models hosted on SQL Server Analysis Services (SSAS).
Although our attention is focused on DBAs, it is still useful to have a brief idea of other user roles within the same huge technology set. The DBA is mainly responsible for assisting all users.
Now, we will explain all components, including SSIS and SSRS, in more detail in the following sections.
Understanding SQL Server DE
The core service in the SQL Server technology set is the SQL Server DE service. This service covers the following three responsibilities, apart from storing and manipulating data:
- Handling recovery: This responsibility means that after any sudden or a planned breakdown of the service or database, the service will recover every database to its last consistent state without any undone transactions.
- Handling transactions: A transaction is mentioned as a single unit of work, and SQL Server DE guarantees that transactions will be durable and isolated and correctly finished with COMMIT or ROLLBACK.
- Handling security: SQL Server DE resolves every request for authentication and authorization and decides if a user or application is known (authenticated) and if a user or application has permission for certain actions (authorization).
SQL Server does not provide its capabilities to end users only. Still, it's necessary to keep in mind that SQL Server DE serves as a base service for almost every other service in the SQL Server technology stack and note the following important points:
- Every BI service, such as Analysis Services or Reporting Services, is actually a client of SQL Server DE.
- Some services, such as Machine Learning Services, can be installed within or independently of SQL Server DE.
- SQL Server Agent (not seen in the previous diagram) plays an exceptional role in the SQL Server ecosystem. This service exists as an indivisible part of every SQL Server DE application. SQL Server Agent hugely helps administrators, as well as other services or components, to automate routine tasks.
Why do we need this information? It's one of the crucial moments when planning a SQL Server installation. For example, Analysis Services is a heavily resource-consuming service, and its deployment along with SQL Server DE could lead to big performance problems and user disappointment with regard to responses on their requests. From a different perspective, installing SQL Server services on separated operating systems leads to increased license expenses and more complex administration efforts.
The following sections will describe each SQL Server service in detail.
SSIS
SSIS is basically used as a data pump of SQL Server. SSIS is used to maintain data movements and transformations between a wide scale of heterogeneous data sources and destinations, as well as migrating or transforming data between several instances of SQL Server. A very common use case of SSIS is in data warehousing to extract, transform, and save data from online transactional processing (OLTP) databases to a data warehouse.
The working unit of this technology is the SSIS package. This is an executable unit of integration services, and we can think of it as a simple application. Its definition consists of two main parts: control flow and data flow. Control flow contains tasks such as creating a temporary folder (Filesystem task), accessing a File Transfer Protocol (FTP) site (FTP task), and many others. One of the most crucial tasks in control flow is called the data flow task. This data flow task contains a definition of the path that data goes through, from data source to data destination.
The integration service itself is not mandatory for SSIS package execution, but the service is used for integration services packages management. It's installed for backward compatibility with older versions of the SSIS packages deployment model. SSIS packages are now commonly placed into a database called SSISDB. The database is not often accessed directly by users or administrators; it is maintained using the Integration Services Catalog.
From an administrator's point of view, the SSIS service installation could be omitted if all existing SSIS packages are deployed to the Integration Services Catalog, which can be created anytime just by a few clicks in SQL Server Management Studio (SSMS).
Integration services often cooperate with two features for data cleansing, validating, and deduplicating. These services are called Master Data Services (MDS)and Data Quality Services (DQS).
MDS
MDS is a technology that provides a very efficient way to manage data that has to be maintained centrally for more applications (for instance, an organizational structure or chart of accounts) or data that should be cleansed and validated before it is sent to other data destinations such as a data warehouse. From an administrator's perspective, it's a database usually called MDS, MDM, or master_data_services (the administrator can choose the database name) and a website created on Internet Information Services (IIS). MDS is not installed within an SQL Server installer; a graphical tool called Master Data Services Configuration Manager is used for its installation and configuration.
Loading data into a MDS database is often done using SSIS. Then, the data is optionally cleansed by data stewards. Clean and consolidated data could be subscribed via subscription views. Definitions of these views are created through the management of the IIS website and stored in the MDS database.
DQS
DQS is a technology providing you with a way to deduplicate and correct data that originates from several sources. Actually, DQS is not a service installed within the SQL Server installer, but it's created by an independent application.
The SSIS package has a special control flow task called the DQS cleansing task that is used when some of the DQS knowledge base (a set of rules created by the data steward) has to be used for data cleansing before the data is written to a target.
Developing solutions with SSIS, and optionally with MDS and DQS, needs complex developers' expertise not primarily needed by SQL Server administrators. From the administrator's perspective, SSIS, along with MDS and DQS, is just another database maintained by SQL Server.
SSAS
SSAS is a very robust and scalable service that steps behind relational database limits by pre-calculating data that has been read from a relational data source. SSAS stores the data in multidimensional storage called a storage model.
This approach is even more efficient for further analysis and visualizations than just the usage of relational data because the multidimensional format allows users to drill down and pivot actions as well as advanced aggregations or period-to-date queries. From this perspective, SSAS forms the core component of corporate as well as self-service BI solutions.
Analysis Services can be installed within SQL Server installer, but it is not always a good idea to have both the SQL Server DE and SSAS service installed on the same computer. We must remember that SSAS is an extremely complicated engine with a lot of physical input/output (I/O) operations when accessing a storage mode. A lot of memory cache is used for data processing and data querying and entails significant central processing unit (CPU) consumption for computations. One more important thing is that results from SSAS are often consumed in applications such as decision support, management reports, and so on, and it's crucial to get responses fast without waiting.
As mentioned previously, in many cases SSAS has to be installed on its own computer. The only disadvantage is that separate installations of SQL Server services lead to separate licensing and more complex maintenance needs. In other words, the more computers that are used to spread SQL Server technologies across an infrastructure, the more licensing expenses will grow.
SSAS can be installed in two distinct modes, as follows:
- Multidimensional mode: This mode is used for centrally created data cubes and mining models.
- Tabular mode: This mode is also called in-memory mode. It's used to host PowerPivot models.
If both modes are needed, the SQL Server installer must be executed twice, and two instances of SSAS have to be installed.
Multidimensional mode of SSAS installation
The multidimensional mode is used for corporate BI scenarios. IT departments develop dimensions, data cubes, and mining models.
The multidimensional mode requires regular data processing, so its approach is for bigger centralized analysis, trend predictions, longitudinal studies, and more. The multidimensional mode is seen as a bigger, robust, and scalable mode, but often with data delay. (An existing storage model called Relational online analytical processing (ROLAP) can be used for real-time analysis but has a lot of constraints. An overview of a real-time operational analysis scenario will be described later in this book.)
Tabular mode of SSAS installation
SQL Server, as well as other Microsoft technologies, supports BI solutions created by business users. This approach is intended for users who are subject matter experts more than IT experts, who have simple but strong enough tools to create their own analysis and visualizations. The toolset is known as Power BI. A part of Power BI is the PowerPivot technology—compressed and somehow pre-calculated data used to build data models similar to data cubes.
For the possibility of sharing our own data models with other users in a well-managed and secured environment, the PowerPivot mode of SSAS was originated. Data models can be deployed with almost no adjustments to the server environment and can then be accessed by authorized users.
One big advantage of PowerPivot mode is that data models are held in memory, and when some additional data is needed to fulfill user requests, it can be read from the data source directly.
Although a detailed description of how analysis services work is beyond the scope of this book, we must know that combining analysis services—no matter which installation mode—with other SQL Server services leads to big performance problems.
SSRS
Data, either relational or multidimensional, does not have its own visible face—data is not visual. To have a complete end-to-end data management platform, Microsoft offers a service called SSRS as a part of the SQL Server technology set. This service is designated to access data from a variety of sources and visualize the data to users. SSRS is a favorite service for centralized and managed reporting.
From an architectural point of view, SSRS is a Windows (or newly Linux) service that offers HTTP/HTTPS endpoints for human-readable as well as web service content consuming. The human-readable endpoint is called Report Portal. It is just a web application for report consumption and management (formerly, Report Portal was called Report Manager).
SSRS has many useful features, including report deployment, report previews, subscriptions, or report exports to formats such as MS Excel or PDF.
Note
SQL Server 2019 Reporting Services installation is no longer a part of the SQL Server installer. From now, SSRS is installed and versioned separately. Linking to the installer is accessible from the SQL Server installation center or the setup wizard step with Feature Selection.
When SQL Server 2016 was up to date, it had two installation modes for reporting service—Native mode and SharePoint mode. However, in SQL Server 2019, SharePoint mode is no longer offered.
When installing SSRS, the web installer allows only installation of the service itself without creating the ReportServer and ReportServerTempdb databases for services metadata, as illustrated in the following screenshot:
In production environments, it is a better option to install an instance of SQL Server DE on its own computer and then install SSRS on its own computer as well. Metadata databases are created later by a visual configuration tool called Reporting Services Configuration Manager.
Machine Learning Services
Predictive analysis profits from efficient and enlarged languages such as Python or R. SQL Server 2016 was the first version of SQL Server that incorporated new features called R Services. This feature is not seen in SQL Server 2019 installation anymore because it was renamed as Machine Learning Services. The renaming reflects the new Python support in SQL Server 2017. SQL Server 2019 offers Java as a third usable language in Machine Learning Services.
Machine Learning Services can be installed via in-server mode. In this installation mode, Java, Python, and R support is incorporated directly into SQL Server DE. When the in-server mode is selected, developers can call the SQL Server stored procedure, sp_execute_external_script, with an R command, a Java command, or a Python command as a parameter.
The second possible mode of installation is the Machine Learning standalone server, which is an independent server consuming and executing R as well as Python scripts and visualizations.
SQL Server Agent
SQL Server Agent's installation is done along with the SQL Server DE installation. The only exception is that SQL Server Express Edition does not allow us to use the SQL Server Agent service.
From an administrator's point of view, SQL Server Agent is a service to plan, execute, and monitor regular tasks (jobs). But the service is used by many other components and services of SQL Server; for instance, SSRS uses SQL Server Agent jobs to deliver reports to end users and more. The first approach that we could consider is the planning and execution of regular administration tasks such as those contained in maintenance plans (backups, reindexing, and so on). However, SQL Server and its services also need to execute other automated actions—for example, the following:
- MDS jobs for the internal maintenance of the MDS database
- Reporting Services jobs for regular subscriptions, report snapshots, and report cache housekeeping
- SQL Server replications internally represented as sets of jobs
- When data collection diagnostics are configured, collection jobs are created and executed
Note
We will discuss the features of SQL Server Agent throughout this book.
In this section, we have seen the application of and the need for certain SQL Server services and features. We have also seen why it is important to install the servers on different operating systems. Now, let's go on to prepare our computers to start the SQL Server 2019 installation on Windows.