Cosmos DB’s multi-region write feature is powerful for global availability, but it can lead to write conflicts. A conflict occurs when two or more clients update the same document concurrently in different regions, before either write has replicated to the other region. The conflict resolution policy then decides which version survives, and that outcome may not match what your application intended.
Let’s see this in action. Imagine two users, one in New York and one in London, both trying to update the quantity of the same product in an inventory database.
// Document before concurrent writes
{
"id": "product-123",
"name": "Wireless Mouse",
"quantity": 10
}
User in New York: Reads quantity = 10. Decrements quantity by 1. Writes quantity = 9.
User in London: Reads quantity = 10. Decrements quantity by 1. Writes quantity = 9.
Each region accepts its write locally, so both writes succeed. Once they replicate, Cosmos DB sees two versions of the same document and has to pick one. Whichever version it keeps, the stored quantity is 9, even though two units were sold and the correct value is 8. The system doesn’t inherently know which write was "correct" or how to merge them, especially if other fields were also changed. This is a write conflict.
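The scenario can be reproduced with a small in-memory simulation. This is only a sketch: `makeRegion` and `decrementLocally` are hypothetical helpers that model two independent replicas, not Cosmos DB APIs, and the `_ts` values are made up.

```javascript
// Hypothetical in-memory model: each region holds its own replica of the item.
function makeRegion(name, doc) {
  return { name, doc: { ...doc } };
}

// Read-modify-write against the local replica only, the way a
// multi-region write is accepted before replication happens.
function decrementLocally(region, ts) {
  const current = region.doc.quantity; // read
  region.doc = { ...region.doc, quantity: current - 1, _ts: ts }; // write
  return region.doc;
}

const initial = { id: "product-123", name: "Wireless Mouse", quantity: 10, _ts: 100 };
const newYork = makeRegion("New York", initial);
const london = makeRegion("London", initial);

// Both writes land before either one has replicated to the other region.
const nyWrite = decrementLocally(newYork, 101);
const londonWrite = decrementLocally(london, 102);

// Two units were sold, so the intended quantity is 8,
// but each region's version says 9.
const intendedQuantity = initial.quantity - 2;
console.log(nyWrite.quantity, londonWrite.quantity, intendedQuantity); // 9 9 8
```

Whichever of the two versions is kept after replication, one decrement is lost.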
The core problem Cosmos DB solves with multi-region writes is maintaining low-latency data access and high availability for a globally distributed user base. When a write happens in a region, it needs to be propagated to other regions. If two writes targeting the same resource occur in different regions before the first write has been fully replicated, a conflict arises.
Cosmos DB handles these conflicts with a configurable conflict resolution policy, set per container. Two modes are available: "Last Writer Wins" (LWW), which is the default, and "Custom", which lets you supply your own merge logic. When a conflict is detected, Cosmos DB applies the chosen policy to determine which version of the document to keep.
With LWW, the system compares a numeric property of the conflicting versions to decide which one to accept. By default this is the system-managed _ts (timestamp) property, though you can configure a different numeric path. The version with the higher value is considered the latest and is propagated across all regions; the other version is discarded.
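As a rough illustration of the rule, the following sketch picks a winner by comparing `_ts`. The function `lastWriterWins` and the sample documents are hypothetical; the real resolution happens inside the service, and how ties are broken there is not modeled here.

```javascript
// Pick the winner among conflicting versions by comparing a numeric property.
// `path` defaults to "_ts"; on a tie this sketch keeps the first version seen.
function lastWriterWins(versions, path = "_ts") {
  return versions.reduce((winner, candidate) =>
    candidate[path] > winner[path] ? candidate : winner
  );
}

const fromNewYork = { id: "product-123", name: "Wireless Mouse", quantity: 9, _ts: 1700000101 };
const fromLondon = { id: "product-123", name: "Wireless Mouse", quantity: 9, _ts: 1700000102 };

const winner = lastWriterWins([fromNewYork, fromLondon]);
console.log(winner._ts);      // 1700000102
console.log(winner.quantity); // 9
```

The winner replaces the whole document. Nothing from the losing version is merged in, which is why the second decrement disappears.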
Diagnosis:
Write conflicts in a multi-region write account usually do not surface as errors on the write itself: each region accepts the write locally, and the conflict is detected later, during replication. Under LWW, conflicts are resolved automatically, so the typical symptom is a missing update rather than an error. Under a custom policy, conflicts that cannot be resolved (for example, because no stored procedure is registered or the procedure throws) are recorded in the container’s conflicts feed. A 409 Conflict response, by contrast, typically means that an item with the same id already exists.
To diagnose, read the conflicts feed of the affected container through one of the SDKs. With the JavaScript SDK (@azure/cosmos), for example:
const { resources: conflicts } = await container.conflicts.readAll().fetchAll();
Each entry describes a conflicting write, including its operation type and the content of the conflicting version.
Common Causes and Fixes:
- Default Last Writer Wins (LWW) Insufficient for Business Logic:
  - Diagnosis: You observe that the LWW policy is discarding valid changes. For instance, if one write increments a counter and another decrements it, LWW keeps only the version with the higher _ts, losing the intended aggregate.
  - Fix: Implement a custom conflict resolution policy. This is configured on the container when it is created, not at the database level. You register a stored procedure that Cosmos DB executes when a conflict is detected. The procedure receives the incoming item, the currently committed item, a tombstone flag for deletes, and any other conflicting items, and it writes the resolved version back to the container.
    // Example stored procedure for custom resolution (e.g., keeping the lower quantity).
    // Sketch only: it ignores conflictingItems and does not handle delete conflicts.
    function resolver(incomingItem, existingItem, isTombstone, conflictingItems) {
        var collection = getContext().getCollection();
        if (isTombstone || !incomingItem) {
            return; // Delete conflicts are out of scope for this sketch
        }
        if (!existingItem) {
            // Nothing committed yet: accept the incoming item as-is
            collection.createDocument(collection.getSelfLink(), incomingItem, function (err) {
                if (err) throw err;
            });
            return;
        }
        // Start from the newer version so that unrelated fields follow LWW
        var resolved = incomingItem._ts > existingItem._ts ? incomingItem : existingItem;
        // Custom logic: keep the lower quantity so the merge never overstates stock
        if (typeof incomingItem.quantity === 'number' && typeof existingItem.quantity === 'number') {
            resolved.quantity = Math.min(incomingItem.quantity, existingItem.quantity);
        }
        // Add other custom resolution logic here for different fields.
        // _ts is system-managed, so the procedure does not set it.
        collection.replaceDocument(existingItem._self, resolved, function (err) {
            if (err) throw err;
        });
    }
  - Why it works: This gives you explicit control over how conflicting updates are merged, ensuring that your specific business rules are applied even when concurrent writes occur. Note that the procedure only sees the final versions, not the value both writers started from, so it cannot reconstruct two separate decrements; counters that must be exact are better modeled as individual adjustment items.
- High Concurrency on the Same Document:
  - Diagnosis: The conflicts feed fills up frequently, or updates to the same items go missing under LWW, indicating many concurrent writes to the same documents.
  - Fix: Optimize your application logic to minimize concurrent writes to the same document. This could involve:
    - Batching: Grouping related updates into a single document write.
    - Optimistic Concurrency Control (OCC): Using ETags. When reading a document, you get its ETag. When writing, you send that ETag as an If-Match condition. If the ETag on the server doesn’t match the one you sent, the document has changed and your write fails with 412 Precondition Failed, allowing your application to re-read and retry. Cosmos DB generates and updates ETags automatically. In a multi-region write account the check runs against the local region’s copy, so it catches races within a region but not writes made concurrently in another region.
    - Partition Key Design: Ensure your partition keys are well-distributed to avoid "hot partitions" where a single partition receives a disproportionate amount of traffic. Hot partitions do not create conflicts by themselves, but the throttling and retries they cause make overlapping writes more likely.
  - Why it works: Reduces the probability of two or more clients attempting to modify the same data simultaneously. OCC explicitly signals when a conflict has occurred at the application level, giving you a chance to react before data is lost.
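- Network Latency and Replication Lag:
  - Diagnosis: Conflicts occur sporadically, even with seemingly low application-level concurrency. Transient network issues or periods of high replication lag between regions widen the window in which two regions can accept writes to the same item.
  - Fix: Implement retry logic in your application with exponential backoff. When a conditional write is rejected with 412 Precondition Failed, wait for a short, increasing period, re-read the item, and retry. Retrying a CreateItemAsync call that failed with 409 Conflict does not help, because that status means an item with the same id already exists.
    // Example C# retry logic. Product, id, and maxRetries are placeholders;
    // this sketch assumes the partition key is the item id.
    for (int retryCount = 0; retryCount < maxRetries; retryCount++)
    {
        // Re-read on every attempt so the change is applied to the latest version
        ItemResponse<Product> read = await container.ReadItemAsync<Product>(id, new PartitionKey(id));
        Product item = read.Resource;
        item.Quantity -= 1;
        try
        {
            await container.ReplaceItemAsync(item, id, new PartitionKey(id),
                new ItemRequestOptions { IfMatchEtag = read.ETag });
            break; // Write accepted
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
        {
            // Exponential backoff: 100 ms, 200 ms, 400 ms, ...
            await Task.Delay(TimeSpan.FromMilliseconds(Math.Pow(2, retryCount) * 100));
        }
    }
  - Why it works: Gives the system time for replication to catch up, so the retried operation works from the latest version of the item instead of a stale one.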
- Incorrect Timestamp (_ts) Handling (Rare):
  - Diagnosis: The _ts value is being manipulated or incorrectly interpreted by the application, leading to unexpected LWW behavior. This is highly unlikely, as _ts is managed by Cosmos DB.
  - Fix: Ensure your application code does not attempt to set or modify the _ts field. Reading it is fine, but treat it as an internal, read-only property managed by the database.
  - Why it works: Prevents accidental interference with the core mechanism Cosmos DB uses for LWW conflict resolution.
- Stale Reads Leading to Stale Writes:
  - Diagnosis: An application reads data, performs some computation, and then writes it back. By the time the write occurs, another write might have already happened, and the stale write silently overwrites it. This is a classic race condition.
  - Fix: Use Optimistic Concurrency Control (OCC) based on the _etag property. When you read a document, Cosmos DB returns an _etag. Send that value in the If-Match header of your subsequent update (the SDKs expose this as a request option); putting it in the document body has no effect. If the document has been modified since you read it, the _etag will have changed, and the write will fail with 412 Precondition Failed. Your application can then re-read the latest version and re-apply its changes.
    // When updating, send the ETag you read in the If-Match header, not in the body
    If-Match: "00000000-0000-0000-0000-000000000001"

    {
      "id": "product-123",
      "name": "Wireless Mouse",
      "quantity": 9
    }
  - Why it works: OCC explicitly detects if the data you’re trying to update has been changed by another process, preventing accidental overwrites and forcing your application to re-evaluate its changes against the latest data.
- Under-provisioned Request Units (RUs) during Peaks:
  - Diagnosis: While not directly a "write conflict" in terms of data versions, severe RU throttling (429 responses) can manifest as write failures and retries, which can increase the likelihood of actual data conflicts occurring when operations eventually succeed.
  - Fix: Monitor your RU consumption and throttling rate, and provision enough RUs for your workload, especially during peak times. Consider using autoscale RU settings.
    # Check the provisioned autoscale maximum for a container
    az cosmosdb sql container throughput show --account-name <your_account> --database-name <your_db> --name <your_container> --resource-group <your_rg> --query "resource.autoscaleSettings.maxThroughput"
  - Why it works: Ensures that your database can handle the throughput of your operations, reducing throttling and the associated retry storms that can exacerbate concurrency issues.
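The ETag-based fixes above (optimistic concurrency plus retry with exponential backoff) can be sketched end to end against an in-memory stand-in. Everything here is hypothetical: `FakeContainer`, `PreconditionFailedError`, and `updateWithRetry` are illustration-only names and not part of any Cosmos DB SDK. The sketch only shows the shape of the read, apply, conditional write, back off, retry loop.

```javascript
// Hypothetical in-memory store that mimics ETag checks; not the Cosmos DB SDK.
class PreconditionFailedError extends Error {
  constructor() {
    super("412 Precondition Failed");
    this.statusCode = 412;
  }
}

class FakeContainer {
  constructor(doc) {
    this.doc = { ...doc };
    this.version = 1;
  }
  read() {
    return { resource: { ...this.doc }, etag: `"${this.version}"` };
  }
  replace(doc, ifMatchEtag) {
    // Reject the write if the caller's ETag is stale.
    if (ifMatchEtag !== `"${this.version}"`) throw new PreconditionFailedError();
    this.doc = { ...doc };
    this.version += 1;
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Re-read, re-apply the change, and retry with exponential backoff on ETag mismatch.
async function updateWithRetry(container, applyChange, { maxRetries = 5, baseDelayMs = 10 } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const { resource, etag } = container.read();
    const updated = applyChange(resource);
    try {
      container.replace(updated, etag);
      return updated;
    } catch (err) {
      // Give up on non-412 errors or when the retry budget is spent.
      if (err.statusCode !== 412 || attempt === maxRetries) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}

// Usage: a competing writer lands between our first read and our first write.
const container = new FakeContainer({ id: "product-123", quantity: 10 });
let interfered = false;
const decrement = (doc) => {
  if (!interfered) {
    interfered = true;
    const other = container.read();
    container.replace({ ...other.resource, quantity: other.resource.quantity - 1 }, other.etag);
  }
  return { ...doc, quantity: doc.quantity - 1 };
};

updateWithRetry(container, decrement).then((result) => {
  console.log(result.quantity); // 8
});
```

The first attempt fails because the competing write changed the ETag; the second attempt re-reads quantity 9 and writes 8, so neither decrement is lost. Against a real multi-region write account this pattern guards races within the region that serves the request, while cross-region conflicts still go through the conflict resolution policy.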
The next hurdle you’ll likely encounter after resolving write conflicts is understanding how to optimize query performance across multiple regions, especially when dealing with eventual consistency guarantees.