“Using the word cosmos rather than the word universe implies viewing the universe as a complex and orderly system or entity; the opposite of chaos” https://en.wikipedia.org/wiki/Cosmos
Source: https://azure.microsoft.com/en-gb/blog/a-technical-overview-of-azure-cosmos-db/
For the seasoned database professional, the NoSQL movement can feel like chaos. However, in a cloud-first world, we all need to ensure we choose the best service for the job and not just the database with which we are familiar and comfortable.
Here are five reasons you might choose Cosmos DB for your next project:
Multi-Model
A relational data structure is not always the best choice for your data scenario. Carefully consider your development project and decide if a normalised data structure in rows and columns is the best choice. Cosmos DB is a multi-model database that supports the following APIs:
- SQL
- Gremlin
- MongoDB
- Cassandra
- Azure Table.
Source: https://docs.microsoft.com/en-us/azure/cosmos-db/introduction
All options offer incredible flexibility. You can simply add additional properties to your data without requiring an “ALTER TABLE ADD COLUMN;” command. If SQL is your language of choice, that is fine. Query JSON documents via familiar SQL commands as if by magic.
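To make the flexible-schema point concrete, here is a minimal sketch in plain Python. It does not use the Cosmos DB SDK; the `products` list, its property names and the `query` helper are all invented for illustration. It shows items with different properties sitting side by side, and an in-memory filter that mimics what a SQL API query along the lines of `SELECT * FROM c WHERE c.category = 'bike'` would return.

```python
# Hypothetical documents; in Cosmos DB these would be JSON items in one container.
products = [
    {"id": "1", "category": "bike", "name": "Road bike"},
    # This item adds a "colour" property; no schema change is needed first.
    {"id": "2", "category": "bike", "name": "Mountain bike", "colour": "red"},
    {"id": "3", "category": "helmet", "name": "Helmet"},
]


def query(items, **filters):
    """Return the items whose properties equal every given filter value."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]


bikes = query(products, category="bike")
print([item["name"] for item in bikes])
```

Items that lack a property are simply skipped by a filter on that property, which is the behaviour that makes adding new properties painless.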
The Gremlin API is a great choice to find relationships between entities. Gremlin uses a graph structure of:
- Vertices – the nouns in your domain, e.g. product, store, or person
- Edges – define the relationships between vertices.
There are many use-cases for graph databases, including fraud detection, social media and recommendation engines.
The MongoDB, Cassandra and Azure Table APIs are fantastic options to support a migration project for an existing application. They are a good fit if you wish to take advantage of Cosmos DB without a large amount of code re-factoring. Move your data using the convenient migration tool, change your connection string, and the job is done.
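The vertex and edge idea can be sketched without a graph database at all. The example below is plain Python rather than Gremlin, and the people, products and the `recommend` helper are hypothetical. It walks "bought" edges out from a person, back to other buyers, and out again to their purchases, which is the shape of a simple recommendation traversal.

```python
from collections import defaultdict

# Hypothetical edges: (source vertex, relationship label, target vertex).
edges = [
    ("alice", "bought", "tent"),
    ("alice", "bought", "stove"),
    ("bob", "bought", "tent"),
    ("bob", "bought", "lantern"),
    ("carol", "bought", "stove"),
]

# Index the edges in both directions so the graph can be walked either way.
outgoing = defaultdict(set)
incoming = defaultdict(set)
for source, _label, target in edges:
    outgoing[source].add(target)
    incoming[target].add(source)


def recommend(person):
    """Products bought by people who share a purchase with `person`."""
    owned = set(outgoing[person])
    peers = {p for product in owned for p in incoming[product]} - {person}
    candidates = {product for peer in peers for product in outgoing[peer]}
    return sorted(candidates - owned)


print(recommend("alice"))  # ['lantern']
```

A graph database earns its keep when these walks span millions of vertices; the logic of the traversal stays the same.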
Cosmos DB stores data with an Atom-Record-Sequence (ARS) structure. You don’t need to know about that low-level construct as the Cosmos engine translates the data into the correct format of Document, Graph, Key-Value, or Column based on your choice of API.
Elastic Scale
You pay for Cosmos DB via a combination of storage and Request Units per second (RU/s). One RU represents the CPU, disk I/O and memory needed to read a single 1 KB item; RU/s is the number of those units you can consume each second.
As your business grows and your throughput requirements increase, simply scale up the RU/s via the Azure portal or programmatically via the Azure CLI.
If your throughput requirements exceed your provisioned RU/s, Cosmos DB throttles operations and users receive a degraded experience. A great way to alleviate this issue is Autopilot mode (since renamed autoscale). On Autopilot, your database scales up and down with demand, up to a maximum cap you define.
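Here is a deliberately simplified model of throttling and scaling, written in plain Python. The function names and the numbers are my own assumptions and the real service is more sophisticated; as far as I know, a throttled request comes back to the client as an HTTP 429 response. The first function spends a one-second RU budget on incoming requests, and the second clamps provisioned throughput to demand between a floor and the cap you define.

```python
def serve_one_second(provisioned_rus, request_costs):
    """Count served and throttled requests for one second of traffic.

    A request is served if its RU cost fits in what is left of the budget
    for this second; otherwise it is counted as throttled.
    """
    remaining = provisioned_rus
    served, throttled = 0, 0
    for cost in request_costs:
        if cost <= remaining:
            remaining -= cost
            served += 1
        else:
            throttled += 1
    return served, throttled


def autoscale_target(demand_rus, max_rus, min_rus):
    """Follow demand, but never below min_rus or above the max_rus cap."""
    return max(min_rus, min(demand_rus, max_rus))


# 50 requests costing 10 RU each against a 400 RU/s budget.
print(serve_one_second(400, [10] * 50))     # (40, 10)
print(autoscale_target(9000, 4000, 400))    # 4000, capped at the maximum
```

The second call shows why the cap matters: demand beyond it is still throttled, so set it with your busiest hour in mind.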
Global Distribution
You can’t overcome the speed of light, so put the data where your users are so everyone gets a great experience. Maybe you’re a global retailer, and you want your customers twelve time zones away to receive the same levels of performance as users near your head office.
Adding writeable replicas in Azure regions around the world is ridiculously easy with Cosmos DB. Select the region on a map and click save.
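A quick back-of-the-envelope calculation shows why distance matters. The sketch below is plain Python with two stated assumptions: light in optical fibre travels at roughly two-thirds of its vacuum speed (about 200,000 km/s), and London to Sydney is roughly 17,000 km in a straight line. Real networks add routing and processing time on top, so treat the result as a floor.

```python
# Assumption: signal speed in optical fibre, roughly two-thirds of c.
FIBRE_SPEED_KM_PER_S = 200_000


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fibre, ignoring routing and processing."""
    return 2 * distance_km / FIBRE_SPEED_KM_PER_S * 1000


# Assumption: approximate straight-line distances in kilometres.
print(round(min_round_trip_ms(17_000)))  # London to Sydney: about 170 ms
print(round(min_round_trip_ms(300)))     # A nearby region: about 3 ms
```

No amount of tuning removes that 170 ms; only moving the data closer does.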
Source: Azure Portal
As SQL Server professionals, we have wrestled for years with Availability Groups, Log Shipping and Replication to create secondary copies. All these technologies have use-cases, but believe me, none is a one-click deployment. All need hardware, licences, careful planning, configuration and on-going maintenance by experts.
Cosmos DB even supports data writes in each of your regions and handles conflicts for you with three options, giving you a choice on which write prevails in the event of a conflict:
- Last-Writer-Wins (LWW)
- Custom – User-defined function
- Custom – Async.
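Last-Writer-Wins is the simplest of the three options to picture. The sketch below is plain Python, not SDK code, and the `writes` data is made up. It assumes the conflict is settled on the system timestamp property `_ts`, which I understand to be the default path, and keeps whichever write carries the highest value.

```python
def last_writer_wins(conflicting_writes, path="_ts"):
    """Return the write with the highest value at `path`."""
    return max(conflicting_writes, key=lambda write: write[path])


# Hypothetical writes to the same item, accepted in two different regions.
writes = [
    {"id": "basket-1", "region": "UK South", "items": 2, "_ts": 1700000000},
    {"id": "basket-1", "region": "East US 2", "items": 3, "_ts": 1700000005},
]

winner = last_writer_wins(writes)
print(winner["region"], winner["items"])  # East US 2 3
```

The losing write is discarded, so choose a custom option instead if you need to merge the two versions.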
In addition to performance, you want additional data copies for business continuity. Azure regions come in pairs, e.g. East US 2 and Central US or UK West and UK South. Click the “Enable geo-redundancy” button, and you have your second copy. It feels like the modern high availability and disaster recovery solution we have been waiting for has finally arrived.
Low Latency
Microsoft provides many service level agreements (SLAs) for Cosmos DB, the most enticing of which are the latency guarantees.
Cosmos DB supports five consistency levels and not just the binary Strong vs Eventual. The levels are:
- Strong
- Bounded staleness
- Session
- Consistent prefix
- Eventual.
The options provide a spectrum of trade-offs between availability, latency, throughput and cost.
Microsoft provides the following latency SLAs:
- <10ms read latency at the 99th percentile (typically 2ms at the 50th percentile)
- <10ms write latency at the 99th percentile (typically 5ms at the 50th percentile).
The SLA does not apply to the strong consistency model across multiple regions. Strong consistency guarantees users always read the most recent version. As you disperse data globally, Cosmos cannot guarantee 10ms latency when you choose strong consistency.
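If the percentile wording in the SLA is unfamiliar, this small sketch may help. It is plain Python using the nearest-rank method, and the latency samples are invented. It shows how a workload can have a 99th percentile under 10 ms even though its single slowest request took far longer.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical read latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [2] * 90 + [6] * 9 + [25]

print(percentile(latencies_ms, 50))  # 2
print(percentile(latencies_ms, 99))  # 6
print(max(latencies_ms))             # 25
```

In other words, the guarantee covers 99 requests in every 100, not the worst single request.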
Cost
Cosmos DB bills you for storage and Request Units per second (RU/s). One RU represents the CPU, disk I/O and memory needed to read a single 1 KB item.
To guarantee the latency SLA mentioned above, you reserve an amount of RU/s and pay for that amount. In a multi-master and multi-region scenario, you pay £8.71 per month for each 100 RU/s (UK South region). Transactional storage costs £0.19 per GB per month.
The number of RUs each of your queries requires depends on several factors. These factors include your global distribution, indexing strategy, data volume and consistency level. The great thing is Cosmos exposes the RU charge for each query you submit. There are also fantastic visualisations of your usage patterns in the metrics section of your Cosmos account in the Azure portal. This information allows you to create a proof of concept and estimate the cost of your project.
Remember, too, to consider the total cost of ownership: Microsoft fully manages Cosmos DB. Try standing up global writeable replicas of a data store on your own and you will find the effort and cost are very high.
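The arithmetic is easy to script for a rough estimate. The sketch below is plain Python that takes the two prices quoted above at face value; the function name and the example workload of 1,000 RU/s with 50 GB are my own, and anything else that may appear on a real bill is ignored.

```python
# Prices quoted in the text for UK South, multi-region with multi-master.
PRICE_PER_100_RUS_GBP = 8.71  # per month, for each 100 RU/s
PRICE_PER_GB_GBP = 0.19       # per month, transactional storage


def monthly_cost_gbp(provisioned_rus, storage_gb):
    """Estimate the monthly charge in pounds for throughput plus storage."""
    throughput = provisioned_rus / 100 * PRICE_PER_100_RUS_GBP
    storage = storage_gb * PRICE_PER_GB_GBP
    return round(throughput + storage, 2)


# Hypothetical workload: 1,000 RU/s and 50 GB of data.
print(monthly_cost_gbp(1000, 50))  # 96.6
```

Here throughput accounts for £87.10 and storage for £9.50, which is typical of how the bill tends to be dominated by RU/s rather than data volume.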
Try Cosmos DB for free
If you haven’t tried Microsoft Learn, give this excellent free training resource a go. There is a fantastic course – “Work with NoSQL data in Azure Cosmos DB” at https://docs.microsoft.com/en-us/learn/paths/work-with-nosql-data-in-azure-cosmos-db/
Microsoft even provides free Azure sandboxes throughout the course for you to experiment with concepts. There is nothing like doing to reinforce your learning.
There is also a free tier of 400 RU/s and 5GB storage, so you have no excuse not to give Cosmos DB a try!