In-memory ColumnStore indexes in SQL Server 2014 can be a performance game-changer for workloads involving large datasets. They can also significantly affect the amount of server resources needed to provide good query performance – but not always in the ways you’d expect.
It’s common to see a transactional system’s database with lots of relatively small tables and a small number of very large tables – history logs or archives. The activity against those large tables is often mostly small inserts with occasional large read-only queries; those queries are often the database server’s most expensive workloads, consuming most of its processor and memory resources. Maintaining consistent application performance when there are extreme variations in workload is a common problem that many organisations have unsuccessfully tried to solve by just adding more server resources.
SQL Server 2012 introduced ColumnStore indexes, one of Microsoft’s in-memory technologies. For the purpose of this article, they’re an alternative method of storing and querying very large amounts of SQL Server table data that, in the right situation, is far more efficient than SQL Server’s traditional row-based storage. In SQL Server 2012 these indexes weren’t updatable, which mostly limited their use to data warehouse scenarios. In SQL Server 2014 they became updatable, allowing their use to spread beyond the read-only world and towards transactional systems.
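To make the 2012 limitation concrete, here is a minimal sketch of a SQL Server 2012-style nonclustered ColumnStore index – the table and column names are hypothetical, not taken from any real system. While such an index existed, the underlying table couldn’t be modified, which is why these indexes mostly suited data warehouses:

```sql
-- Hypothetical history table and columns, for illustration only.
-- SQL Server 2012: a nonclustered ColumnStore index over a large history table.
-- While this index exists, INSERT/UPDATE/DELETE against dbo.EventHistory fail,
-- so the table is effectively read-only until the index is dropped or disabled.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_EventHistory
ON dbo.EventHistory (EventId, EventDate, CustomerId, EventType, Amount);
```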
The link below is to a presentation I gave two years ago about the then recently released ColumnStore technology: here
Recently, I was helping a client evaluate the new features in SQL Server 2014 including ColumnStore indexes. One of their application systems had exactly the problem I described above, specifically the following environment:
Table rows | Table size | Server CPU resources | Server memory
~1 billion | ~36GB | 2 vCPUs | 32GB

Test query run time: ~30 seconds
A common application activity involved querying the billion-row table, which SQL Server executed as an expensive parallel clustered index scan. Consequently, the server’s storage was used heavily and the database server’s memory was fairly ineffective as a cache. Perhaps surprisingly, the server’s CPU usage was always fairly low.
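The client’s actual statement isn’t reproduced here, but it had the same general shape as this hypothetical aggregate over the whole table – the kind of query the optimiser answers with a parallel scan of the clustered index:

```sql
-- Hypothetical example only: an aggregate over the full billion-row history table.
-- Against a traditional rowstore clustered index, SQL Server answers this with a
-- parallel clustered index scan, reading the whole table from storage.
SELECT   EventType,
         COUNT(*)    AS EventCount,
         SUM(Amount) AS TotalAmount
FROM     dbo.EventHistory
WHERE    EventDate >= '20130101'
GROUP BY EventType;
```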
This style of workload made it a good candidate for trialling SQL Server 2014’s Clustered ColumnStore Index feature, so we proceeded with a proof of concept.
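As a rough sketch of the conversion step – again using the hypothetical object names from the examples above rather than the client’s real ones – replacing the traditional clustered index with a clustered ColumnStore index looks something like this:

```sql
-- Drop the existing rowstore clustered index (hypothetical name).
-- If the clustered index backs a PRIMARY KEY constraint, drop the constraint
-- with ALTER TABLE ... DROP CONSTRAINT instead.
DROP INDEX CIX_EventHistory ON dbo.EventHistory;

-- Rebuild the table as a clustered ColumnStore index (SQL Server 2014 onwards).
-- A clustered ColumnStore index replaces the table's storage entirely,
-- so no column list is specified.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_EventHistory
ON dbo.EventHistory;
```

Note that in SQL Server 2014 a clustered ColumnStore index can’t coexist with any other indexes on the same table, so any remaining nonclustered indexes also have to be dropped as part of the conversion.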
Having converted the large table from a traditional clustered index to a clustered ColumnStore index, we had the following environment:
Table rows | Table size | Server CPU resources | Server memory
~1 billion | ~3GB | 2 vCPUs | 32GB

Test query run time: ~200 seconds
The table above shows four observations significant to this PoC:
- the row count was unchanged at ~1 billion;
- the table shrank from ~36GB to ~3GB – roughly a twelve-fold reduction;
- the server’s CPU and memory allocations were unchanged;
- the test query’s run time got worse, rising from ~30 seconds to ~200 seconds.
Using the above observations, we concluded that the query was no longer constrained by storage or memory; with the ColumnStore index in place, its run time was constrained by the two vCPUs available to it. For the purpose of this PoC, we then made the following changes: the server’s CPU allocation was increased from 2 to 8 vCPUs and its memory was reduced from 32GB to 6GB.
Table rows | Table size | Server CPU resources | Server memory
~1 billion | ~3GB | 8 vCPUs | 6GB

Test query run time: ~5 seconds
The table above shows four observations significant to this PoC:
- the row count and compressed table size were unchanged;
- the server’s CPU allocation was quadrupled from 2 to 8 vCPUs;
- the server’s memory was cut from 32GB to 6GB – less than a fifth of the original allocation;
- the test query’s run time fell from ~200 seconds to ~5 seconds, around six times faster than the original ~30 seconds.
The PoC was a very fast-paced trial of a new SQL Server feature, with the objective of understanding its potential relevance to the client’s environment. However, our observations went beyond the “you’ll need less disk space” conclusion we expected. Instead, what we learnt was that:
ColumnStore indexes are a great tool in the toolbox when looking to solve performance issues; however, their performance no longer hinges on storage and memory – it hinges on something potentially far more expensive: CPU resource.