Choosing the right partition key for Azure Cosmos DB is the single most critical decision you’ll make for performance and cost.
Let’s see how a poorly chosen partition key can bring your database to its knees. Imagine we have a Cosmos DB container storing IoT device telemetry. Our initial thought is to partition by deviceId.
Here’s a simplified look at the data:
{
"id": "device1-telemetry-1",
"deviceId": "device1",
"timestamp": "2023-10-27T10:00:00Z",
"temperature": 25.5
}
{
"id": "device1-telemetry-2",
"deviceId": "device1",
"timestamp": "2023-10-27T10:01:00Z",
"temperature": 25.6
}
{
"id": "device2-telemetry-1",
"deviceId": "device2",
"timestamp": "2023-10-27T10:00:00Z",
"temperature": 22.1
}
If we partition by deviceId, and device1 is a very active device sending data every second while device2 only sends data once an hour, every request for device1 lands in the same logical partition, and therefore on the single physical partition that hosts it. This is a "hot partition."
This is what a hot partition looks like in the Azure portal:
[Imagine a screenshot here showing Request Units per second on the Y-axis and Time on the X-axis. One partition shows a consistently high, spiky line, while others are nearly flat.]
Notice how one partition is consistently hammered, while others are mostly idle. This single partition is the bottleneck. It has a finite capacity for Request Units (RUs) and storage. When it reaches its limit, requests targeting it will be throttled (HTTP 429 errors), and your application will experience latency.
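One way to turn that visual check into something scriptable is to compare each partition's RU/s against the median. The sketch below is a minimal illustration, not an Azure Monitor API call: the function name find_hot_partitions, the threshold factor of 3, and the sample readings are all assumptions made for the example.

```python
from statistics import median


def find_hot_partitions(ru_by_partition, factor=3.0):
    """Return ids of partitions whose RU/s exceeds `factor` times the median."""
    typical = median(ru_by_partition.values())
    return sorted(
        pid for pid, ru in ru_by_partition.items()
        if typical > 0 and ru > factor * typical
    )


# Hypothetical per-partition RU/s readings, e.g. exported from a metrics dashboard.
sample = {"0": 120.0, "1": 95.0, "2": 4800.0, "3": 110.0}
print(find_hot_partitions(sample))  # ['2']
```

The median is used rather than the mean so that the hot partition itself does not drag the baseline upward.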
The goal of a good partition key is to distribute your data and requests evenly across as many physical partitions as possible. Cosmos DB automatically adds physical partitions as your data grows and your provisioned throughput increases. A single logical partition, however, is never split: it is capped at 20 GB of storage and always lives on one physical partition. A good partition key produces many small logical partitions, so that new physical partitions are actually put to use.
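To make the distribution argument concrete, here is a small simulation of skewed traffic. It is only a sketch: Cosmos DB's real hash function and its assignment of hash ranges to physical partitions are internal, so SHA-256 with a modulo stands in for them, and the names physical_partition and load_per_partition are made up for this example.

```python
import hashlib
from collections import Counter


def physical_partition(key, partition_count):
    """Map a partition key value to a partition index (stand-in for the real hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def load_per_partition(keys, partition_count):
    """Count how many requests land on each simulated physical partition."""
    return Counter(physical_partition(k, partition_count) for k in keys)


# 900 requests from one busy device, 100 requests spread over 100 quiet devices.
requests = ["device1"] * 900 + [f"device{i}" for i in range(2, 102)]
load = load_per_partition(requests, partition_count=4)
print(sorted(load.items()))
```

However many partitions you simulate, all 900 requests for device1 end up on the same one, because a single key value always hashes to the same place.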
How to avoid hot partitions:
- High Cardinality is Key: Choose a property that has a large number of distinct values. If you have 100,000 devices, partitioning by deviceId seems good, but if 90% of your traffic comes from 10 devices, it’s not.
  - Diagnosis: Monitor your RU consumption per physical partition in Azure Monitor. Look for extreme outliers.
  - Fix: If deviceId is the problem, and you can’t change your application logic to generate a more distributed deviceId, consider a composite key. For example, if you have a concept of region or siteId, you might partition by deviceId + regionId. This distributes traffic across devices and regions.
  - Why it works: This spreads the load of a single high-traffic device across multiple physical partitions if that device operates in different regions.
- Avoid Time-Based Keys (Alone): Partitioning by timestamp or date is a common mistake. All recent data will land on the same partition, creating a hot spot for writes.
  - Diagnosis: Look for a constantly increasing RU consumption on one or a few partitions over time.
  - Fix: If your access patterns are primarily time-based, consider a compound key that includes a high-cardinality identifier. For example, deviceId + yyyy-MM-dd will distribute time-series data for different devices across partitions.
  - Why it works: This ensures that even if you’re writing data for today, the writes are spread across partitions based on the deviceId.
- Analyze Your Query Patterns: How do you read and write data? If you always query by userId, then userId is a strong candidate for your partition key. If you rarely query by userId, it’s a bad choice.
  - Diagnosis: Review your application’s common query patterns. Use the Cosmos DB diagnostic logs or APIM logs to identify frequently used filter clauses.
  - Fix: If your queries are often by deviceId and sensorType, and deviceId alone isn’t enough, consider a hierarchical partition key with the paths /deviceId and /sensorType, in that order.
  - Why it works: Queries that include the partition key (or its prefix for hierarchical keys) can be routed directly to the relevant physical partition, making them highly efficient.
- Consider Synthetic Keys: If no single natural property provides enough cardinality or distribution, create a synthetic key.
  - Diagnosis: You have a natural key (e.g., orderId) but your access patterns are not directly on it, or it doesn’t have enough unique values.
  - Fix: In your application, generate a random integer (e.g., 1 to 1000) and store it as a shardId property. Use /shardId as your partition key.
  - Why it works: This artificially spreads your data and traffic across a predefined number of logical partitions, ensuring even distribution regardless of the natural key’s characteristics. The trade-off is on the read side: a query that doesn’t know the shardId has to fan out across all of them.
- Understand Composite Partition Keys: You can use a combination of properties. The order matters.
  - Diagnosis: You have two properties that, when combined, offer better distribution than either alone.
  - Fix: When creating your container, point the partition key at the combined value. The Azure CLI sketch below assumes --partition-key-path takes a single path, so the application writes the combined value into one property (deviceRegion is a hypothetical name, holding values such as "device1-westus"):
      az cosmosdb sql container create \
        --account-name mycosmosdbaccount \
        --resource-group myresourcegroup \
        --database-name mydatabase \
        --name mycontainer \
        --partition-key-path "/deviceRegion"
    To keep /deviceId and /region as separate levels instead, define a hierarchical partition key with both paths through an SDK, an ARM or Bicep template, or the portal.
  - Why it works: Cosmos DB hashes the combined deviceId and region value to determine the logical partition, and with it the physical partition, providing a wider distribution of data and requests.
- Be Wary of Low-Cardinality Properties: Properties like status (e.g., 'Pending', 'Completed'), type (e.g., 'User', 'Admin'), or country (if most users are from one country) are poor partition keys.
  - Diagnosis: Monitoring shows 2-3 partitions consuming almost 100% of RUs, while others are idle.
  - Fix: If you’ve already created a container with a bad key, the only way to fix it is to create a new container with the correct partition key and migrate your data. You can then delete the old container.
  - Why it works: A new container with a well-chosen key allows you to start with a clean slate, distributing data and requests evenly from the beginning.
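The device-plus-date and shard-suffix ideas from the list above can be sketched in plain Python. This is an illustration under stated assumptions rather than a Cosmos DB SDK feature: the helper names, the "-" separator, and the default of 10 shards are all hypothetical, and the suffix is derived from a hash of the item id instead of a random number so that it can be recomputed later.

```python
import hashlib
from datetime import datetime


def device_day_key(device_id, timestamp):
    """Combine a device id with the date of the reading (deviceId + yyyy-MM-dd)."""
    day = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").strftime("%Y-%m-%d")
    return f"{device_id}-{day}"


def sharded_key(device_id, item_id, shard_count=10):
    """Append a shard suffix derived from the item id, so reads can recompute it."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    return f"{device_id}-{int(digest, 16) % shard_count}"


def all_shard_keys(device_id, shard_count=10):
    """Every key value a query must cover to read all data for one device."""
    return [f"{device_id}-{shard}" for shard in range(shard_count)]


print(device_day_key("device1", "2023-10-27T10:00:00Z"))  # device1-2023-10-27
print(sharded_key("device1", "device1-telemetry-1") in all_shard_keys("device1"))  # True
```

A purely random suffix spreads writes just as well, but a point read then needs the suffix stored somewhere or has to try every value returned by all_shard_keys.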
Even with a good partition key, you can hit temporary hot spots if a single logical partition receives an unexpected burst of traffic. Cosmos DB cannot spread one logical partition across several physical partitions, so the burst is served by a single physical partition. Autoscale throughput or burst capacity, if enabled, can absorb short spikes, and anything beyond that is throttled. If you’re consistently hitting limits on a specific logical partition, it’s time to re-evaluate your partition key choice.
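Storage is the other limit a single logical partition can hit, since it is capped at 20 GB. A rough back-of-the-envelope check shows how quickly a chatty device gets there when deviceId alone is the key. The 1 KB stored size per document is an assumption (real size depends on your schema and indexing), and the function name is made up for this sketch.

```python
LOGICAL_PARTITION_LIMIT_BYTES = 20 * 1024**3  # 20 GB cap, taken here as 20 * 1024^3 bytes


def days_until_full(doc_bytes, docs_per_second):
    """Days of continuous writes before one logical partition reaches the cap."""
    bytes_per_day = doc_bytes * docs_per_second * 86_400
    return LOGICAL_PARTITION_LIMIT_BYTES / bytes_per_day


# Assumed workload: 1 KB per stored document, one reading per second.
days = days_until_full(doc_bytes=1024, docs_per_second=1)
print(round(days, 1))  # 242.7
```

Under these assumptions device1 fills its logical partition in about eight months, which is one more reason to fold a date or shard suffix into the key.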
The next challenge you’ll face is understanding how to optimize your queries to take full advantage of this partitioning strategy.