The most surprising truth about Cosmos DB RU provisioning is that "container" isn’t just a smaller scope than "database"; it’s fundamentally a different unit of scale and isolation.

Let’s see this in action. Imagine we’re setting up a new application that needs to store user profiles. We’ll start with a database named userdb and a container within it called profiles.

Here’s a simplified view of provisioning throughput at the database level, giving the entire userdb a shared pool of 400 RU/s (the exact request shape depends on the API, SDK, or template you use):

{
    "id": "userdb",
    "throughput": 400
}

Now, if we wanted to provision at the container level, we’d remove the throughput from the database and assign it to the profiles container:

{
    "id": "userdb"
}

And then for the container:

{
    "id": "profiles",
    "throughput": 400
}
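To make the two options concrete, here is a minimal sketch using the azure-cosmos Python SDK, where throughput is passed as `offer_throughput` at creation time. The helper names, the environment variable names, and the `/userId` partition key path are assumptions made for illustration; they don't come from the setup above.

```python
def provision_database_level(client):
    """Create userdb with a 400 RU/s pool shared by its containers."""
    # offer_throughput on the database creates the shared pool.
    return client.create_database_if_not_exists(id="userdb", offer_throughput=400)


def provision_container_level(client, partition_key):
    """Create userdb without throughput, then give profiles its own 400 RU/s."""
    database = client.create_database_if_not_exists(id="userdb")
    # offer_throughput on the container dedicates the 400 RU/s to it.
    return database.create_container_if_not_exists(
        id="profiles", partition_key=partition_key, offer_throughput=400
    )


def main():
    # Requires `pip install azure-cosmos`. Imported here so the helpers above
    # can be read and exercised without the package installed.
    import os
    from azure.cosmos import CosmosClient, PartitionKey

    # COSMOS_ENDPOINT and COSMOS_KEY are hypothetical environment variable names.
    client = CosmosClient(
        os.environ["COSMOS_ENDPOINT"], credential=os.environ["COSMOS_KEY"]
    )
    # "/userId" is an assumed partition key path for the profiles container.
    provision_container_level(client, PartitionKey(path="/userId"))
```

Call `main()` with real credentials to run it against an account. Note that the `*_if_not_exists` calls leave an existing database or container untouched, so they won’t change throughput on resources that are already there.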

The key difference is how those 400 RU/s are managed. With database-level provisioning, all containers within userdb share that 400 RU/s pool. If we had another container, say settings, also within userdb, both profiles and settings would draw on the same 400 RU/s. If profiles suddenly needed 350 RU/s while settings needed 100 RU/s, the combined demand (450 RU/s) would exceed the provisioned capacity (400 RU/s), and requests to either container could be throttled (rate-limited with HTTP 429 responses), not just requests to settings.
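The arithmetic behind that scenario can be sketched in a few lines. This is a toy model of a shared pool, not how the service actually schedules requests, and the function name is made up for the example.

```python
def shared_pool_excess(pool_ru, demand_by_container):
    """Return how many RU/s of combined demand exceed a shared pool (0 if none)."""
    total = sum(demand_by_container.values())
    return max(0, total - pool_ru)


demand = {"profiles": 350, "settings": 100}
excess = shared_pool_excess(400, demand)
print(f"combined demand: {sum(demand.values())} RU/s, excess: {excess} RU/s")
# prints: combined demand: 450 RU/s, excess: 50 RU/s
```

Any non-zero excess means some requests against the database will be rate-limited, regardless of which container issued them.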

With container-level provisioning, each container gets its own dedicated slice of throughput. If we provisioned profiles with 400 RU/s and settings with its own 400 RU/s (the minimum for manually provisioned throughput), each container has its guaranteed 400 RU/s. They don’t contend with each other for capacity. This isolation is crucial for predictable performance, especially when you have workloads with vastly different performance needs or when you want to prevent "noisy neighbors" from impacting critical operations.
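The same toy model, adapted to dedicated throughput, shows the isolation: each container is compared only against its own limit. The limits below assume 400 RU/s per container, which to my knowledge is the minimum for manual throughput; the function name is hypothetical.

```python
def throttled_containers(limit_by_container, demand_by_container):
    """Return the names of containers whose own demand exceeds their own limit."""
    return sorted(
        name
        for name, demand in demand_by_container.items()
        if demand > limit_by_container[name]
    )


limits = {"profiles": 400, "settings": 400}

# Normal load: nobody is over their own budget.
print(throttled_containers(limits, {"profiles": 350, "settings": 100}))  # []

# A spike on profiles throttles profiles only; settings is unaffected.
print(throttled_containers(limits, {"profiles": 900, "settings": 100}))  # ['profiles']
```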

This choice impacts cost, performance predictability, and management complexity. Database-level provisioning is simpler to manage if you have a few containers with similar, predictable throughput needs. You set it once at the database level and forget it. However, it’s prone to throttling if one container’s demand spikes and starves others. Container-level provisioning offers granular control and isolation, ensuring that each container gets its guaranteed performance, but it can lead to over-provisioning if not carefully monitored, as you might provision for peak capacity on each container independently.
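The over-provisioning risk is easiest to see with numbers. The sketch below uses invented demand figures and assumes manual throughput with a 400 RU/s minimum and 100 RU/s increments. It compares sizing each container for its own peak against sizing one shared pool for the combined peak.

```python
import math

MIN_RU = 400   # assumed minimum for manual throughput
STEP_RU = 100  # assumed provisioning increment


def round_up_ru(demand):
    """Round a demand figure up to a provisionable RU/s value."""
    return max(MIN_RU, math.ceil(demand / STEP_RU) * STEP_RU)


# Hypothetical peak demand per time window (RU/s); the peaks don't coincide.
demand = {
    "profiles": [900, 300, 200],
    "settings": [100, 150, 700],
}

# Dedicated: every container is sized for its own peak.
dedicated_total = sum(round_up_ru(max(series)) for series in demand.values())

# Shared: one pool is sized for the peak of the combined demand.
combined = [sum(window) for window in zip(*demand.values())]
shared_total = round_up_ru(max(combined))

print(f"dedicated: {dedicated_total} RU/s, shared: {shared_total} RU/s")
# prints: dedicated: 1600 RU/s, shared: 1000 RU/s
```

When peaks don’t overlap, the shared pool needs less total throughput, but it gives up the isolation: a spike beyond the combined estimate affects every container in the database.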

A detail that is often overlooked: when a database has throughput provisioned, any container created in it without its own throughput setting shares the database’s pool. Only containers that were explicitly given dedicated throughput are isolated from it. This means a single database can mix both models, with some containers drawing on the shared pool and others using their own allocation. Keep this in mind when debugging unexpected throttling on a specific container within a database that appears to have sufficient total throughput: the container may be competing for the shared pool rather than using capacity of its own.
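When debugging, it helps to write down which containers actually draw on the shared pool. The sketch below is a simplified model of that rule, not an SDK call; `throughput_scope` and the `audit` container are hypothetical, and `None` stands for "no throughput set at this level".

```python
def throughput_scope(database_ru, container_ru):
    """Classify where a container's RU/s come from."""
    if container_ru is not None:
        return ("dedicated", container_ru)  # container has its own allocation
    if database_ru is not None:
        return ("shared", database_ru)      # container draws on the database pool
    return ("none", None)                   # nothing provisioned at either level


database_ru = 400
containers = {"profiles": None, "settings": None, "audit": 400}

sharing = sorted(
    name
    for name, ru in containers.items()
    if throughput_scope(database_ru, ru)[0] == "shared"
)
print(sharing)  # ['profiles', 'settings']
```

Here profiles and settings compete for the database’s 400 RU/s, while audit is isolated with its own 400 RU/s.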

Ultimately, the decision hinges on your application’s architecture and performance requirements. For predictable, high-performance workloads or when strict isolation is paramount, container-level provisioning is the way to go. For simpler applications with uniform needs, database-level provisioning offers a more streamlined approach.

The next logical step is understanding how to leverage autoscale provisioning at either the database or container level.
