Azure Cosmos DB for PostgreSQL can start on a single node and scale out to a multi-node cluster as load grows, so you pay for distributed capacity only when you need it.
Let’s see this in action. Imagine you’re running a SaaS application with users spread across the globe. Your PostgreSQL database needs to be accessible, fast, and cost-effective, no matter where your users are. This is where Cosmos DB for PostgreSQL shines.
Here’s a typical setup:
You start by creating a "coordinator" node. This is the entry point for your application’s connections and queries. For a small startup, this might be a citus-c.js-4 node with 4 vCores and 16 GiB RAM.
az postgres flexible-server create \
  --resource-group myResourceGroup \
  --name my-global-pg-coord \
  --location eastus \
  --sku citus-c.js-4 \
  --tier Standard \
  --storage-size 128 \
  --admin-user myadmin \
  --admin-password mypassword
Now, to achieve global distribution and horizontal scaling, you add "worker" nodes. These nodes store shards of your distributed tables. You can place these workers in different Azure regions to serve users locally and improve read performance. Let’s add a worker in westus:
az postgres flexible-server create \
  --resource-group myResourceGroup \
  --name my-global-pg-worker-1 \
  --location westus \
  --sku citus-c.js-4 \
  --tier Standard \
  --storage-size 128 \
  --admin-user myadmin \
  --admin-password mypassword
And another in westeurope:
az postgres flexible-server create \
  --resource-group myResourceGroup \
  --name my-global-pg-worker-2 \
  --location westeurope \
  --sku citus-c.js-4 \
  --tier Standard \
  --storage-size 128 \
  --admin-user myadmin \
  --admin-password mypassword
Once these are created, you link them to your coordinator using the citus_add_node function. Connect to your coordinator node using psql and run:
SELECT citus_add_node('my-global-pg-worker-1.postgres.database.azure.com', 5432);
SELECT citus_add_node('my-global-pg-worker-2.postgres.database.azure.com', 5432);
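Note that citus_add_node does not prompt for credentials. You authenticate once, when psql connects to the coordinator with the admin username and password you set during creation, and the coordinator must already be able to authenticate to each worker before the call succeeds.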
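To confirm that the workers were registered, one option is to query the Citus metadata from the coordinator. The following is a minimal sketch, assuming a reasonably recent Citus version in which citus_get_active_worker_nodes() and the pg_dist_node catalog are available; the exact columns can differ between versions.

```sql
-- List the worker nodes the coordinator currently considers active.
SELECT * FROM citus_get_active_worker_nodes();

-- Inspect the node catalog directly for more detail.
SELECT nodeid, nodename, nodeport, isactive, noderole
FROM pg_dist_node
ORDER BY nodeid;
```

If a worker is missing from the first result, distributed tables created afterwards will not place shards on it.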
The magic happens when you create distributed tables. You create a regular PostgreSQL table first and then call create_distributed_table on it. This function tells Citus (the extension powering Cosmos DB for PostgreSQL) to shard the table across your worker nodes. You choose a distribution column, which determines how rows are spread across shards. For a multi-tenant application, tenant_id is a natural choice. Any primary key or unique constraint on the table has to include the distribution column.
-- Connect to the coordinator node
CREATE TABLE users (
    user_id serial,
    tenant_id int NOT NULL,
    username text,
    email text,
    -- The primary key must include the distribution column (tenant_id)
    PRIMARY KEY (tenant_id, user_id)
);
-- Distribute the table by tenant_id
SELECT create_distributed_table('users', 'tenant_id');
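After distributing the table, it can help to look at what Citus actually created. This is a sketch rather than a required step, assuming the citus_tables and citus_shards views (present in recent Citus releases) are available on your cluster.

```sql
-- One row per distributed table: how it is distributed and into how many shards.
SELECT table_name, citus_table_type, distribution_column, shard_count
FROM citus_tables
WHERE table_name = 'users'::regclass;

-- One row per shard placement: which worker holds which shard.
SELECT shardid, shard_name, nodename, nodeport
FROM citus_shards
WHERE table_name = 'users'::regclass
ORDER BY shardid;
```

The second query shows each shard as an ordinary table on a worker, which is a useful reminder that sharding here is built from regular PostgreSQL objects.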
Now, when you insert data, Citus hashes the tenant_id of each row and routes the INSERT to the worker node that holds the matching shard. Queries that filter on tenant_id can be sent to the single worker holding that tenant’s data, so they avoid cross-node coordination. Joins on tenant_id run locally on the workers when the joined tables are colocated. For queries that involve multiple tenants, the coordinator orchestrates the distributed query execution, fanning the query out to the relevant shards and merging the results.
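The routing behavior can be observed from the coordinator. The statements below are an illustrative sketch: the tenant id 42 and the sample row are made up, and the exact EXPLAIN output depends on the Citus version, so treat the comments as what to look for rather than guaranteed output.

```sql
-- Single-tenant write: routed to the one shard that owns tenant 42.
INSERT INTO users (tenant_id, username, email)
VALUES (42, 'alice', 'alice@example.com');

-- Ask Citus which shard a given distribution value maps to.
SELECT get_shard_id_for_distribution_column('users', 42);

-- Single-tenant read: the plan should target a single shard.
EXPLAIN SELECT * FROM users WHERE tenant_id = 42;

-- Cross-tenant aggregate: the coordinator fans out to the shards
-- and merges the partial results.
EXPLAIN SELECT tenant_id, count(*) FROM users GROUP BY tenant_id;
```

Comparing the two plans is a quick way to check whether a query your application issues is a single-shard query or a fan-out.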
The problem this solves is the classic database scaling bottleneck. A single PostgreSQL node eventually struggles with high write loads and large datasets, and sharding manually is complex and error-prone. Cosmos DB for PostgreSQL, powered by Citus, automates sharding and distributed query processing, so you scale out by adding worker nodes rather than re-architecting the application.
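Adding worker nodes only helps once existing shards are spread onto them. As a hedged sketch, assuming the Citus rebalancer functions are exposed to your role (the managed service may also offer rebalancing through the portal), the flow looks roughly like this:

```sql
-- Preview the shard moves the rebalancer would make; nothing is moved yet.
SELECT * FROM get_rebalance_table_shards_plan();

-- Move shards until the workers hold a roughly even share.
SELECT rebalance_table_shards();
```

Previewing the plan first is a cheap way to estimate how much data will move before committing to the rebalance.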
The mental model is that of a cluster of PostgreSQL servers working together. The coordinator acts as the "brain," receiving all application requests. It then directs queries and data placement to the worker nodes, which are the "brawn" handling the bulk of storage and processing. Each worker node is a fully functional PostgreSQL instance, so you keep most of the familiar PostgreSQL features, tools, and extensions, although some features come with restrictions on distributed tables (for example, unique constraints must include the distribution column).
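One way to make the brain-and-brawn picture concrete is to ask the coordinator to run a plain PostgreSQL query on every worker. This is a small sketch, assuming the run_command_on_workers function and the citus_shards view are available in your Citus version.

```sql
-- Run an ordinary PostgreSQL query on every worker and collect the answers.
SELECT nodename, nodeport, success, result
FROM run_command_on_workers($$ SELECT version() $$);

-- Count shard placements per worker to see how the storage is spread.
SELECT nodename, count(*) AS shard_count
FROM citus_shards
GROUP BY nodename
ORDER BY nodename;
```

Each row of the first result comes from a separate PostgreSQL server, which is what "fully functional PostgreSQL instance" means in practice.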
The truly surprising part is how seamlessly Citus handles distributed transactions. When you run an UPDATE or DELETE statement that affects rows across multiple shards (and thus multiple worker nodes), Citus ensures atomicity. It uses a two-phase commit (2PC) protocol. The coordinator first asks all involved workers to prepare the transaction. If all workers acknowledge they are ready, the coordinator then tells them to commit. If any worker fails to prepare, the coordinator tells all others to roll back, guaranteeing consistency without manual intervention.
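From the application’s side, a multi-shard change is written as an ordinary transaction. The sketch below assumes the users table from earlier; the tenant ids and the username value are hypothetical. pg_prepared_xacts is a standard PostgreSQL view, and recover_prepared_transactions is a Citus function for cleaning up after a failure.

```sql
BEGIN;
-- Touches rows for several tenants, so more than one shard (and possibly
-- more than one worker) is involved.
UPDATE users SET email = lower(email) WHERE tenant_id IN (1, 2, 3);
DELETE FROM users WHERE username = 'obsolete-account';
-- On COMMIT, Citus prepares the transaction on each involved worker
-- and then commits it everywhere.
COMMIT;

-- Prepared transactions still pending on each worker, if any
-- (normally none once the commit has finished).
SELECT nodename, result
FROM run_command_on_workers($$ SELECT count(*) FROM pg_prepared_xacts $$);

-- Ask Citus to resolve prepared transactions left behind by a failure.
SELECT recover_prepared_transactions();
```

The last two statements are diagnostic only; in normal operation the two-phase commit is invisible to the application.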
The next concept you’ll explore is optimizing distributed queries, particularly how to leverage "colocation" for join performance.