Configuring Log Analytics storage options
Once you have completed the configuration of a few data connectors, you will begin to see how much data you are ingesting and storing in Log Analytics on a daily basis. The amount of data you store and retain directly impacts costs (see Chapter 1, Getting Started with Azure Sentinel, for further details). You can view the current usage and costs by navigating to the Log Analytics workspace and selecting Usage and estimated costs from the General menu, as shown in the following screenshot:
Once selected, you are presented with a dashboard that shows the pricing tier and current costs on the left-hand side, and graphs on the right-hand side that show the daily variation in consumption over the last 31 days. A second graph shows the total size of retained data, per solution. An example of the dashboard is shown in the following screenshot:
From this page, explore two of the options available along the top menu bar:
- Daily cap: With this option, you can enable or disable a limit on how much data is ingested into the Log Analytics workspace each day. While this is a useful control for limiting costs, capping ingestion risks losing security information that is valuable for detecting threats across your environment. We recommend only using this for non-production environments.
- Data Retention: This option allows you to configure the number of days data is retained in the Log Analytics workspace. The default for Log Analytics is 31 days; however, when Azure Sentinel is enabled on the Log Analytics workspace, the default free amount is 90 days. If you choose to increase retention beyond 90 days, you will be charged a set fee per gigabyte (GB) per month. A scripted way to adjust these settings is sketched after this list.
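If you prefer to script these settings rather than change them in the portal, both can be adjusted with the Az PowerShell module. The following is a minimal sketch, assuming the Az.OperationalInsights module is installed and that the workspace is named CyberSecSOC in a resource group called Sec-RG (both placeholder names); the -DailyQuotaGb parameter is an assumption that depends on the module version installed, so verify it before relying on it:

```powershell
# Sign in and select the subscription that hosts the workspace (names are placeholders)
Connect-AzAccount
Set-AzContext -Subscription 'My-Subscription'

# Review the current workspace settings, including the retention period
Get-AzOperationalInsightsWorkspace -ResourceGroupName 'Sec-RG' -Name 'CyberSecSOC'

# Extend retention to 180 days; anything beyond the free 90 days is billed per GB per month
Set-AzOperationalInsightsWorkspace -ResourceGroupName 'Sec-RG' -Name 'CyberSecSOC' -RetentionInDays 180

# Apply a daily cap of 10 GB (assumed parameter; only available in recent Az.OperationalInsights versions)
Set-AzOperationalInsightsWorkspace -ResourceGroupName 'Sec-RG' -Name 'CyberSecSOC' -DailyQuotaGb 10
```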
In the next section, we will look at how we calculate the costs involved in data ingestion and retention for Azure Sentinel and Log Analytics.
Calculating the cost of data ingestion and retention
Many organizations need to retain security log data for longer than 90 days and must budget to ensure they have enough capacity for their business needs. For example, if we consider the need to keep data for 2 years, with an average daily ingestion rate of 10 GB, we can calculate the cost of the initial ingestion and analysis and compare it to the cost of retention. This provides an annual cost estimate for both aspects.
The following table shows the cost for ingesting data into Log Analytics and analyzing that data in Azure Sentinel. This price includes 90 days of free retention:
The following table shows the cost of the data retained past the free 90 days included in the preceding pricing, based on ingesting 10 GB per day:
Now, if we add these together, we can see the total cost of the solution over a 12-month period, shown in the following table:
Note
These prices are based on the current rates applicable to the US East Azure region, and figures are rounded for simplicity. Actual data usage may fluctuate from month to month.
Based on these examples, the total cost of running Azure Sentinel, ingesting 10 GB per day and retaining data for 2 years, would be $39,780. Data retention accounts for 22% of the cost.
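If you want to produce a similar estimate for your own environment, the arithmetic behind these tables can be expressed in a few lines of PowerShell. The per-GB rates below are placeholders rather than the figures used in the preceding tables, so substitute the current prices for your region from the Azure pricing page; this is a sketch of the method, not a quote:

```powershell
# Placeholder rates: substitute the current prices for your region from the Azure pricing page
$ingestionRatePerGb = 4.76    # combined Log Analytics ingestion + Azure Sentinel analysis, per GB
$retentionRatePerGb = 0.10    # retention beyond the free 90 days, per GB per month

$dailyIngestGb = 10           # average daily ingestion
$retentionDays = 730          # total retention target (2 years)
$freeRetention = 90           # days included at no extra charge

# Annual cost of ingesting and analyzing the data
$annualIngestionCost = $dailyIngestGb * 365 * $ingestionRatePerGb

# Steady-state volume sitting in paid retention once the full 2 years of data have accumulated
$paidRetentionGb = $dailyIngestGb * ($retentionDays - $freeRetention)

# Annual cost of keeping that data beyond the free period
$annualRetentionCost = $paidRetentionGb * $retentionRatePerGb * 12

"Ingestion/analysis: {0:C0} per year" -f $annualIngestionCost
"Retention:          {0:C0} per year" -f $annualRetentionCost
"Total:              {0:C0} per year" -f ($annualIngestionCost + $annualRetentionCost)
```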
Because the charges are based on the volume of data (in GB), one way of maintaining reasonable costs is to carefully select which data is initially gathered, and which data is kept long term. If you plan to investigate events that occurred more than 90 days ago, then you should plan to retain that data. Useful log types for long-term retention include the following:
- Identity and Access Management (IAM) events, such as authentication requests, password changes, new and modified accounts, group membership changes, and more
- Configuration and change management events for core platforms, network configuration, and access controls across boundaries
- Creation, modification, and deletion of resources such as virtual machines, databases, and cloud applications (Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) resources)
Other data types can be extremely useful for initial analysis and investigation; however, their value diminishes as the data ages. They include the following:
- Information from industrial control solutions
- Events streamed from IoT devices
Also, consider that some platforms sending data to Azure Sentinel can be configured to retain the original copies of the log data for longer periods of time, potentially without excessive additional cost. Examples include your firewall and Cloud Access Security Broker (CASB) solutions.
Reviewing alternative storage options
The benefit of retaining data within Log Analytics is the speed of access to search the data when needed, without having to write new queries. However, many organizations require specific log data to be retained for long periods of time, usually to meet internal governance controls, external compliance requirements, or local laws. Log Analytics currently supports retention for a maximum of 2 years, which can be a limitation.
The following solutions may be considered as alternatives for long-term storage outside of Log Analytics:
- Azure Blob Storage: You can create a query in Azure Monitor to select the data you want to move from the Log Analytics workspace and point it at the appropriate Azure Storage account. This allows you to filter the information by type (or select everything) and move only the data that is about to reach the end of its life, that is, the retention limit you have set in the Log Analytics workspace. Once you have defined the query, you can use Azure Automation to run a PowerShell script that exports the results to a CSV file and copies it to Azure Blob Storage (a sketch of this flow appears after this list). With this solution, data can be stored for up to 400 years! For further information, see this article: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/powershell-quickstart-samples.
- Azure SQL: Data that is stored in Azure Blob Storage can then be ingested into Azure SQL (Database or Data Warehouse), which enables another method of searching and analyzing the data. This process uses Azure Data Factory to connect the Azure Blob Storage location to Azure SQL Database/Data Warehouse, automating the ingestion of any new data. For further information, see this article: https://azure.microsoft.com/en-us/documentation/articles/data-factory-copy-data-wizard-tutorial/.
- Azure Data Lake Storage Gen2: As an alternative to Azure Blob Storage, this option enables access to the data via open source platforms such as HDInsight, Hadoop, Cloudera, Azure Databricks, and Hortonworks. The data does not need to be transferred (as it does with Azure SQL), and this solution offers easier management, better performance, and lower cost. For further information, see this article: https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction.
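As a rough illustration of the Azure Blob Storage flow described in the first bullet, the following PowerShell sketch queries the workspace for records that are about to reach the retention limit, exports them to a CSV file, and uploads the file to a storage container. The workspace ID, storage account, container name, and the SecurityEvent table used in the query are all placeholder values, and the account running it needs query and blob-write permissions; in practice, you would run something like this on a schedule from Azure Automation:

```powershell
# Placeholder identifiers: replace with your own workspace and storage details
$workspaceId    = '00000000-0000-0000-0000-000000000000'
$storageAccount = 'sentinelarchive'
$containerName  = 'loganalytics-archive'

# Select records from a single day that is about to fall out of the 2-year retention window
$query   = 'SecurityEvent | where TimeGenerated between (ago(730d) .. ago(729d))'
$results = Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId -Query $query -Timespan (New-TimeSpan -Days 731)

# Write the result set to a local CSV file
$fileName = "SecurityEvent-archive-$(Get-Date -Format 'yyyyMMdd').csv"
$results.Results | Export-Csv -Path $fileName -NoTypeInformation

# Upload the CSV file to the Blob Storage container for long-term retention
$context = New-AzStorageContext -StorageAccountName $storageAccount -UseConnectedAccount
Set-AzStorageBlobContent -File $fileName -Container $containerName -Blob $fileName -Context $context
```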
As you can see, there are plenty of options available for storing the data in alternative locations, both for extended archive/retention and for additional analysis with alternative tools. We expect Microsoft to increase the number of options available over time.