Cosmos DB’s Bulk Executor library doesn’t actually move data in bulk; it parallelizes individual operations to feel like a bulk operation.

Let’s see it in action. Imagine you have a products collection and you want to add 10,000 new product documents.

// Sample product document
{
  "id": "prod-12345",
  "name": "Super Widget",
  "price": 19.99,
  "category": "Widgets"
}

Here’s how you’d use the Bulk Executor library in C# to insert these:

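using System; // Console
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos; // CosmosClient and Container (.NET SDK v3)
using Microsoft.Azure.Cosmos.BulkExecutor; // BulkOperationManager and BulkOperationOptions, as named in this article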

// ... inside your application logic ...

var cosmosClient = new CosmosClient("YOUR_COSMOS_DB_CONNECTION_STRING");
var container = cosmosClient.GetContainer("your_database_id", "products");

var productsToInsert = new List<object>();

for (int i = 0; i < 10000; i++)
{
    productsToInsert.Add(new
    {
        id = $"prod-{i:D5}", // Example: prod-00000, prod-00001 ...
        name = $"Product #{i}",
        price = 9.99 + (i % 10), // Varies price slightly
        category = "Generic"
    });
}

var bulkOperationManager = new BulkOperationManager(container);

// Initiate the bulk insert operation
await bulkOperationManager.BulkInsertAsync(productsToInsert,
    new BulkOperationOptions
    {
        MaxConcurrencyPerPartition = 50, // Adjust based on your RUs and partition key
        MaxBatchSize = 100 // Number of operations in each batch sent to Cosmos DB
    });

Console.WriteLine("Bulk insert completed.");

The BulkOperationManager is the core component. You give it a list of documents (or a list of operations such as upserts, replaces, and deletes) and it handles the orchestration: batching, concurrency, and retries.

The magic behind BulkInsertAsync (and other bulk operations) isn’t a single, monolithic "bulk" API call to Cosmos DB. Instead, the library takes your collection of items and breaks it down into smaller batches. For each batch, it creates a series of individual calls such as CreateItemAsync or UpsertItemAsync. The key is that it then executes these individual calls concurrently and, importantly, across multiple partitions if your data is spread over more than one partition key.
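The batching and fan-out described above can be sketched with nothing but the .NET base class library. This is a minimal illustration of the pattern, not the library’s actual implementation: BulkSketch, InsertAllAsync, and the insertOne delegate are hypothetical names, and insertOne stands in for whatever per-document call (for example Container.CreateItemAsync) your code would make.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: illustrates batching plus bounded concurrency.
public static class BulkSketch
{
    // Walks the items in batches of batchSize and runs one task per item,
    // never allowing more than maxConcurrency operations in flight at once.
    // Returns the number of operations that completed successfully.
    public static async Task<int> InsertAllAsync<T>(
        IReadOnlyList<T> items,
        Func<T, Task> insertOne, // stand-in for a per-document insert call
        int batchSize,
        int maxConcurrency)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        if (maxConcurrency < 1) throw new ArgumentOutOfRangeException(nameof(maxConcurrency));

        using var gate = new SemaphoreSlim(maxConcurrency);
        int completed = 0;

        async Task RunOneAsync(T item)
        {
            await gate.WaitAsync(); // wait for a free concurrency slot
            try
            {
                await insertOne(item);
                Interlocked.Increment(ref completed);
            }
            finally
            {
                gate.Release();
            }
        }

        for (int start = 0; start < items.Count; start += batchSize)
        {
            int end = Math.Min(start + batchSize, items.Count);
            var tasks = new List<Task>(end - start);
            for (int i = start; i < end; i++)
            {
                tasks.Add(RunOneAsync(items[i]));
            }
            await Task.WhenAll(tasks); // finish this batch before starting the next
        }

        return completed;
    }
}
```

Calling BulkSketch.InsertAllAsync(products, p => container.CreateItemAsync(p), 100, 50) would mirror the batch size and concurrency values used earlier. Each document is still its own operation; only the waiting overlaps.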

The MaxConcurrencyPerPartition option is critical. It dictates how many operations the library will attempt to run in parallel against a single logical partition key. If you have 10,000 documents, MaxConcurrencyPerPartition is 50, and all of those documents map to the same partition key, the library will keep up to 50 operations in flight against that one partition at a time. This is where you can exhaust your provisioned Request Units (RUs) and start getting throttled if you’re not careful.
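To make the per-partition cap concrete, here is a small sketch that keeps one semaphore per partition key. It is an assumption about how such a limit could be enforced, not a description of the library’s internals; PartitionThrottle and RunAsync are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: caps in-flight operations per partition key.
public sealed class PartitionThrottle
{
    private readonly int _maxPerPartition;
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _gates =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    public PartitionThrottle(int maxPerPartition)
    {
        if (maxPerPartition < 1) throw new ArgumentOutOfRangeException(nameof(maxPerPartition));
        _maxPerPartition = maxPerPartition;
    }

    // Runs the operation once a slot is free for the given partition key.
    // Operations on different keys do not block each other.
    public async Task RunAsync(string partitionKey, Func<Task> operation)
    {
        SemaphoreSlim gate = _gates.GetOrAdd(
            partitionKey, _ => new SemaphoreSlim(_maxPerPartition));

        await gate.WaitAsync();
        try
        {
            await operation();
        }
        finally
        {
            gate.Release();
        }
    }
}
```

With a cap of 50, a workload whose documents all share one partition key never has more than 50 operations in flight, while a well-distributed workload can run 50 per key in parallel.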

The MaxBatchSize determines how many individual operations are grouped together into a single request sent to Cosmos DB. Even though you’re performing 10,000 inserts, the library might be sending requests for batches of 100 items at a time, but it’s doing so across many concurrent threads and potentially many partitions.

The problem this solves is the per-document overhead of issuing calls one at a time. For a single document, the network round trip to Cosmos DB, authentication, serialization, and deserialization add a significant fixed cost, and a sequential loop pays that cost 10,000 times back to back. By keeping hundreds or thousands of operations in flight at once, the Bulk Executor library overlaps that latency rather than eliminating it. You still pay the RUs for each individual operation (an insert of a small document costs on the order of 5 RUs, and more for larger documents or heavier indexing), but the job finishes much faster because the waiting happens in parallel.
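A quick back-of-the-envelope calculation shows why RUs, not latency, become the limit once operations run in parallel. The helper below is a hypothetical sketch; the 5 RU per insert figure comes from the text above, and the provisioned throughput values are illustrative assumptions.

```csharp
using System;

// Hypothetical helper: rough capacity arithmetic for a bulk insert.
public static class BulkEstimate
{
    // Number of batches needed for documentCount items (ceiling division).
    public static int BatchCount(int documentCount, int maxBatchSize)
    {
        if (maxBatchSize < 1) throw new ArgumentOutOfRangeException(nameof(maxBatchSize));
        return (documentCount + maxBatchSize - 1) / maxBatchSize;
    }

    // Total RUs consumed if every insert costs ruPerInsert.
    public static double TotalRequestUnits(int documentCount, double ruPerInsert)
    {
        return documentCount * ruPerInsert;
    }

    // Lower bound on duration: total RUs divided by provisioned RU/s.
    // Real runs take longer because of retries and uneven partitions.
    public static double MinimumSeconds(
        int documentCount, double ruPerInsert, double provisionedRuPerSecond)
    {
        if (provisionedRuPerSecond <= 0) throw new ArgumentOutOfRangeException(nameof(provisionedRuPerSecond));
        return TotalRequestUnits(documentCount, ruPerInsert) / provisionedRuPerSecond;
    }
}
```

For 10,000 documents at 5 RUs each, the total is 50,000 RUs. On a container provisioned at an assumed 1,000 RU/s, that cannot complete in under 50 seconds no matter how much concurrency you add; at 10,000 RU/s the floor drops to 5 seconds.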

The library automatically handles retries for throttled requests (429s) and connection issues. It also tries to distribute operations across partitions intelligently if your data has a good distribution of partition keys. If your data is heavily skewed towards a few partition keys, you’ll see much lower throughput and more throttling on those hot partitions, even with high MaxConcurrencyPerPartition settings.
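The retry behavior for throttled requests can be pictured as a loop that waits for the interval the server suggests and then tries again. The sketch below is an assumption about the general shape of that logic, not the library’s code: ThrottledException is a hypothetical stand-in for a 429 response (in the official .NET SDK the equivalent signal is a CosmosException with status code 429 and a RetryAfter value), and RetrySketch is an illustrative name.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a "429 Too Many Requests" failure.
public sealed class ThrottledException : Exception
{
    public TimeSpan RetryAfter { get; }

    public ThrottledException(TimeSpan retryAfter)
        : base("Request rate too large (429)")
    {
        RetryAfter = retryAfter;
    }
}

public static class RetrySketch
{
    // Runs the operation, retrying up to maxRetries times when throttled.
    // Returns how many retries were needed; rethrows once retries run out.
    public static async Task<int> ExecuteWithRetryAsync(
        Func<Task> operation, int maxRetries)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                await operation();
                return attempt;
            }
            catch (ThrottledException ex) when (attempt < maxRetries)
            {
                // Back off for the server-suggested interval before retrying.
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}
```

Retries hide throttling from your code, but they do not make it free: every retried operation still consumes time, which is why a hot partition drags down overall throughput.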

What most people don’t realize is that the BulkOperationManager doesn’t have direct access to the underlying CosmosClient’s internal connection pooling or HTTP client. It essentially orchestrates calls to the Container object, which in turn uses the CosmosClient. This means that if you have multiple BulkOperationManager instances operating on the same CosmosClient, they will share the same underlying HTTP connections and thread pools. It’s generally more efficient to create a single CosmosClient and BulkOperationManager and reuse them for all your bulk operations.

The next hurdle you’ll likely face is optimizing MaxConcurrencyPerPartition based on your container’s provisioned RUs and your partition key distribution.
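Before tuning anything, it can help to measure how skewed your input actually is. The following sketch, with the hypothetical names SkewCheck and HottestPartition, reports which partition key value receives the largest share of a batch of documents; it assumes you can extract the partition key from each document with a simple selector.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: finds the most heavily loaded partition key in a batch.
public static class SkewCheck
{
    // Returns the partition key with the most documents and its share (0..1)
    // of the total. Throws if the input is empty.
    public static (string Key, double Share) HottestPartition<T>(
        IEnumerable<T> documents, Func<T, string> partitionKeySelector)
    {
        var counts = documents
            .GroupBy(partitionKeySelector)
            .Select(g => new { Key = g.Key, Count = g.Count() })
            .ToList();

        if (counts.Count == 0)
        {
            throw new ArgumentException("No documents supplied.", nameof(documents));
        }

        int total = counts.Sum(c => c.Count);
        var top = counts.OrderByDescending(c => c.Count).First();
        return (top.Key, (double)top.Count / total);
    }
}
```

If one key holds most of the batch, as it would in the earlier example where every product has the category "Generic", raising MaxConcurrencyPerPartition mostly produces more throttling on that key rather than more throughput.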
