Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Harnessing the power of both worlds
1. Author: Karan Gulati (SSAS Maestro)
HARNESS THE POWER OF BOTH WORLDS
Harness the Power of Both Worlds –SQL Engine and Analysis Services
Introduction
The combination of the SQL Engine and Analysis Services is awesome because it lets you harness the
power of both worlds: columnstore Indexes in the SQL engine and real-time or relational OLAP (ROLAP)
in Analysis Services.
Recently I have done some testing around a much talked about feature of the SQL engine – columnstore
indexes. As you can learn in BOL, in SQL Server columnstore indexes store and manage data by using
column-based data storage and column-based query processing in-memory. Columnstore indexes work
well for data warehousing workloads that primarily perform bulk loads and read-only queries. By using
the columnstore index you can achieve up to 10x query performance gains over traditional row-
oriented storage, and up to 7x data compression over the uncompressed data size.
A columnstore index is a technology for storing, retrieving and managing data by using a columnar data
format, called a columnstore. SQL Server supports both clustered and nonclustered columnstore
indexes. Both use the same in-memory columnstore technology, but they do have differences in
purpose and in features they support.
Benefits
Columnstore indexes work well for mostly read-only queries that perform analysis on large data sets.
Often, these are queries for data warehousing workloads. Columnstore indexes give high performance
gains for queries that use full table scans, and are not well-suited for queries that seek into the data, or
that search for a particular value.
Columnstore indexes have these benefits:
Columns often have similar data, which results in high compression rates.
High compression rates improve query performance by using a smaller in-memory footprint. In
turn, query performance can improve because SQL Server can perform more query and data
operations in-memory.
A new query execution mechanism called batch-mode execution has been added to SQL Server
that reduces CPU usage by a large amount. Batch-mode execution is closely integrated with, and
optimized around, the columnstore storage format. Batch-mode execution is sometimes known
as vector-based or vectorized execution.
Queries often select only a few columns from a table, which reduces total I/O from the physical
media.
Solution with ROLAP and Columnstore Index
So how can you leverage this new shining star of SQL Server for business intelligence solutions?
I began by combining clustered columnstore indexes with Analysis Services ROLAP feature, enabling my
fact tables for ROLAP, and then tested the performance of aggregations. As expected, I got pretty
awesome performance with 500 million records.
2. Author: Karan Gulati (SSAS Maestro)
HARNESS THE POWER OF BOTH WORLDS
So I doubled the number of records to 1 billion! At this scale, basic aggregates such as SUM and COUNT
were returned in 2 to 3 seconds. The acid test of course lies in a certain trouble -making measure – you
guessed it, Distinct Count.
So I did testing with Distinct Counts on the ROLAP solution and results were pretty positive, with around
1.5 billion records (1555618627), and around 15,638,865 distinct values. On a cold cache I got results
within 2 seconds and on a warm cache in 16 milliseconds.
This was pretty awesome so I decided to try the combination of warm-caching with ROLAP in order to
meet my SLA for the solution dashboards, which was around 4 seconds.
Recommendations
There are some important settings that you need to consider when implementing a BI solution based on
in-memory columnstore indexes and Analysis Services ROLAP:
EnableRolapDistinctCountOnDataSource – This is a very important property when we are using
Analysis Services ROLAP. Set this property to 1 when using ROLAP with distinct count
calculations. The property enables calculation of distinct count at the source (the relational
database) instead of fetching rows in Analysis Services and calculating distinct count using
Analysis Services.
Statistics – In the absence of SQL Statistics Query Optimizer you will not be able to generate the
right execution plan of a given query. Therefore it is very important to do timely updates of the
statistics for the columnstore table; otherwise DML operations can degrade performance.
Hardware
For this experiment I used a Virtual Machine with 16 logical processors and 120 GB Ram.
References
Columnstore Indexes: http://msdn.microsoft.com/en-us/library/gg492088.aspx