As we all are aware, SQL 2012 introduced columstore indexes which is based Microsoft’s xvelocity in-memory engine and as you might have heard and read in numerous articles it is known to provide 5-50X improvement in performance of queries.
Reading and Hearing 5-50X improvement in performance of queries causes many eyes to pop-out and they begin to consider building columnstore indexes in most of the tables assuming it would provide significant improvement in their application. In this blog post, I would like to emphasize on when should one consider building columnstore index and further ensuring whether the query is benefited from batch processing mode of columnstore index.
Let me start with
When should you not consider creating Columnstore index
- On small tables with fewer rows, typically in thousands of records one should not create Columnstore index.
- If the queries filters on the columns with high selectivity with no aggregation. In order words if the query plan already perform Seek operation.
- If the table is highly transactional which requires frequent inserts, updates and deletes.
The above points are all characteristics of an OLTP database and hence columnstore indexes are not good candidate for OLTP system which consists of normalized transactional databases.
When should you consider creating Columnstore index
- On large tables, containing large number of records typically containing millions of records.
- Queries which perform aggregations, sorting, group by, order by which requires full table scan of the tables
- The table should refresh periodically and should be transactional in nature.
The above points are all characteristics of an DW database with every large fact table in which measures are aggregated grouped by dimension columns and in which the tables are refreshed less frequently by ETL operations.
Here is how I interpret columnstore indexes, while seeks are optimized by clustered and nonclustered indexes, Tables Scans, Index Scans, Aggregations, Sorts are optimized by using Columnstore indexes.
However, Columnstore index alone doesn’t provide the 5-50X performance improvement, the performance gains is observed when columnstore index scan is used in conjunction with batch processing which is the new vector based query execution engine designed to optimize columnstore index scan. Batch processing operates on 1000 rows batch at a time which provides that eye popping performance improvement which you might be aiming at.
Batch processing is supported by Columnstore Index Scan, Hash Aggregates, Hash Join, Hash Build, filter operators in the query plan and hence when you would like performance tune the query using columnstore indexes, you need to focus on achieving the batch processing mode and avoid row processing mode.
Batch Processing and Segment Elimination alongwith Columnstore Index Scan can provide maximum performance gain.
If you have gone through Eric Hanson’s session from TechEd here , he has clearly specified under following conditions, the query might not go batch processing mode
- Outer Joins
- IN & Exists
- UNION ALL
In the same session, Eric has shown workarounds to rewrite the query to achieve batch processing mode which gives significant performance improvement.
I encountered one such query in which query didn’t go for batch processing mode by default and I had to rewrite the query to force it to go for batch processing mode.
Consider the following Demo
Use Adventureworks2012 GO IF OBJECT_ID('dbo.Demo') is not NULL DROP TABLE Demo SELECT * INTO Demo FROM Sales.SalesOrderDetail GO DECLARE @i int = 1 WHILE(@i<=5) BEGIN INSERT INTO Demo(SalesOrderID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid,ModifiedDate) SELECT SalesOrderID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,newi(),ModifiedDate FROM Sales.SalesOrderDetail SET @i=@i+1 END CREATE CLUSTERED INDEX IDX_CLUS_Demo ON Demo(SalesOrderDetailId) GO CREATE NONCLUSTERED INDEX IDX_NONCLUS_Demo ON Demo(SalesOrderId,Productid) GO CREATE NONCLUSTERED COLUMNSTORE INDEX IDX_CS_Demo ON Demo ( SalesOrderID, SalesOrderDetailid, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount ) GO
Now when I fire the following query
SELECT s.OrderDate, s.ShipDate, sum(d.OrderQty),avg(d.UnitPrice),avg(d.UnitPriceDiscount) FROM Demo d join Sales.SalesOrderHeader s on d.SalesOrderID=s.SalesOrderID WHERE d.OrderQty>500 GROUP BY s.OrderDate,s.ShipDate
If I go by the rules which I described above, the query contains inner join between a very large table (Demo) and smaller table SalesOrderHeader and the query doesn’t use OuterJoin, IN or Union All so it should go for batch processing mode.
But to my surprise, the query was going for row processing mode and the elapsed time for the query was around 300 ms on my machine.
I started wondering why it is not using the batch processing mode and Nested Loop Join caught my attention since Loop Join doesn’t support Batch Processing Mode which might be forcing the query to go for row processing mode.
So I used the query hint OPTION(hash Join) to force hash Join and Voila there I see my Batch Processing Mode used by the query plan.
With Batch Processing Mode, my query completed within the flash of an eye and it was blistering fast.
From this example, my intention is not to promote query hints but we need to understand that with SQL 2012, optimizer code may not be as matured to account for all possible ways to execute batch processing mode. Hence we need to rewrite our code or use such hints which will influence the optimizer to choose batch processing since batch processing and segment elimination can give you that mind blowing performnace improvement which you might be aiming for with columnstore index.
Premier Field Engineer
Thanks for your interest in SQL Server Column Stores.
Let me explain the observations you have made. The SQL Query Optimizer proceeds in stages with each stage willing to spend greater resources in the search for an efficient execution plan. Early stages of optimization do not consider batch execution for joins and if the cost of any serial plan is below the “cost threshold for parallelism” (default = 5) then no parallel plan will be considered. Batch processing is currently only supported for parallel execution. Since the row based loop join plan is costed below 5 (I’m inferring) we don’t consider the batch plan, even though we would choose it if we considered it.
Campbell Fraser, SQL Development
Can we force SQL server optimizer to use batch mode without giving hints in SQL queries? Is there any configuration for that?
Hey there! Do you use Twitter? I’d like to follow you if that would be ok. I’m definitely enjoying your blog
and look forward to new posts.
Thanks for a great post. In one of a scenario of our customer though the hash join was used still the query was using row mode. The query has just inner join and based on the performance tuning guide the query followed all the principles. The customer also observed rampant behavior as for some of the combination of columns the batch mode kicks in but for some it doesn’t. The distinct values of all the columns being used are between 5 to 600 and the total number of rows is 65 million. Any clue what might be happening?
We need to investigate your scenario since this is the first time hearing about it. So when does the batch mode kick in and when u include\exclude which columns does it goes back to row mode.
Do you have repro for the issue. Also ensure you are on the latest service pack just in case if you are hitting any bugs or known issues