Azure Cosmos DB is Microsoft’s globally distributed, multi-model database service.

  • With a click of a button, Cosmos DB enables you to elastically and independently scale throughput and storage across any number of Azure regions worldwide.
  • You can elastically scale throughput and storage, and take advantage of fast, single-digit-millisecond data access using your favorite API including SQL, MongoDB, Cassandra, Tables, or Gremlin.
  • Cosmos DB provides comprehensive service level agreements (SLAs) for throughput, latency, availability, and consistency guarantees, something no other database service offers.
  • By virtue of deep integration with Azure infrastructure and transparent multi-master replication, Cosmos DB provides 99.999% high availability for both reads and writes. Cosmos DB also provides you with the ability to programmatically (or via Portal) invoke the regional failover of your Cosmos account. This capability helps ensure that your application is designed to failover in the case of regional disaster.

Consistency Levels

Azure Cosmos DB approaches data consistency as a spectrum of choices instead of two extremes. Strong consistency and eventual consistency are at the ends of the spectrum, but there are many consistency choices along the spectrum. Developers can use these options to make precise choices and granular tradeoffs with respect to high availability and performance.

With Azure Cosmos DB, developers can choose from five well-defined consistency models on the consistency spectrum. From strongest to more relaxed, the models include strong, bounded staleness, session, consistent prefix, and eventual consistency. The models are well-defined and intuitive and can be used for specific real-world scenarios. Each model provides availability and performance tradeoffs and is backed by the SLAs. The following image shows the different consistency levels as a spectrum.

 

Consistency as a spectrum

While you estimate the number of RUs per second to provision, consider the following factors:

  • Item size: As the size of an item increases, the number of RUs consumed to read or write the item also increases.

  • Item indexing: By default, each item is automatically indexed. Fewer RUs are consumed if you choose not to index some of your items in a container.

  • Item property count: Assuming the default indexing is on all properties, the number of RUs consumed to write an item increases as the item property count increases.

  • Indexed properties: An index policy on each container determines which properties are indexed by default. To reduce the RU consumption for write operations, limit the number of indexed properties.

  • Data consistency: The strong and bounded staleness consistency levels consume approximately two times more RUs while performing read operations when compared to that of other relaxed consistency levels.

  • Query patterns: The complexity of a query affects how many RUs are consumed for an operation. Factors that affect the cost of query operations include:

    • The number of query results
    • The number of predicates
    • The nature of the predicates
    • The number of user-defined functions
    • The size of the source data
    • The size of the result set
    • Projections

    Azure Cosmos DB guarantees that the same query on the same data always costs the same number of RUs on repeated executions.v

Anuj holds professional certifications in Google Cloud, AWS as well as certifications in Docker and App Performance Tools such as New Relic. He specializes in Cloud Security, Data Encryption and Container Technologies.

Initial Consultation

Anuj Varma – who has written posts on Anuj Varma, Hands-On Technology Architect, Clean Air Activist.