Managed Databases on DigitalOcean can scale reads by adding read replicas, which are essentially copies of your primary database that handle read-only traffic.

Let’s see this in action. Imagine you have a PostgreSQL database, my-app-db, in a DigitalOcean project. Initially, it’s just the primary node.

doctl databases list --format ID,Name,Engine,Version,NumNodes,Status

Output:

2023-09-16T10:00:00Z	my-app-db	PostgreSQL	14.5	1	active

Now, we want to add a read replica. We’ll use the doctl command-line tool.

doctl databases replica create 2023-09-16T10:00:00Z my-app-db-replica --size db-s-1vcpu-2gb

After a few minutes, let’s check the status again.

doctl databases list --format ID,Name,Engine,Version,NumNodes,Status

Output:

2023-09-16T10:00:00Z	my-app-db	PostgreSQL	14.5	2	active
2023-09-16T10:05:00Z	my-app-db-replica	PostgreSQL	14.5	1	active

Notice that the node count for my-app-db is now 2. The my-app-db-replica is a separate managed database resource, but it’s linked to the primary, and it gets its own connection string. Your application must then send read queries to the replica’s connection string and write queries to the primary’s.

The core problem read replicas solve is the I/O bottleneck on a single primary node when read traffic significantly outweighs write traffic. Offloading reads to dedicated replica nodes frees the primary to handle writes, and the read capacity available to your application grows with each replica you add. Each replica is a fully functional, albeit read-only, instance of your database that is kept up to date by replicating from the primary.

The synchronization mechanism for PostgreSQL is typically asynchronous streaming replication. The primary continuously streams its write-ahead log (WAL) records to the replica as they are generated, and the replica replays them against its own data files. Because the primary does not wait for the replica to catch up, there is a replication lag: a usually small delay between a write committing on the primary and becoming visible on the replica. You can monitor this lag in the DigitalOcean control panel or with SQL queries specific to your database engine (e.g., the pg_stat_replication view on a PostgreSQL primary).
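As a rough illustration of lag monitoring from the replica side, the sketch below asks the replica how long ago it last replayed a transaction. It assumes the psycopg2 driver and a replica connection string that you supply; classify_lag, read_lag_seconds, and the 5-second threshold are hypothetical names and values chosen for this example, not DigitalOcean defaults.

```python
from typing import Optional

# Standard PostgreSQL functions. The CASE yields NULL on a primary,
# where pg_is_in_recovery() is false.
LAG_QUERY = """
SELECT CASE
         WHEN pg_is_in_recovery()
         THEN EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS lag_seconds
"""


def classify_lag(lag_seconds: Optional[float], warn_after: float = 5.0) -> str:
    """Turn a raw lag reading into a coarse status string.

    warn_after is an illustrative threshold; pick one that matches how
    much staleness your application can tolerate.
    """
    if lag_seconds is None:
        return "unknown"
    return "lagging" if lag_seconds > warn_after else "ok"


def read_lag_seconds(replica_dsn: str) -> Optional[float]:
    """Run LAG_QUERY against the replica and return the lag in seconds."""
    import psycopg2  # imported lazily so the helpers above work without it

    conn = psycopg2.connect(replica_dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(LAG_QUERY)
            (lag,) = cur.fetchone()
            return None if lag is None else float(lag)
    finally:
        conn.close()
```

One caveat with this query: it measures the time since the last replayed transaction, so the number also grows when the primary is simply idle and there is nothing new to replay.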

When you create a read replica, DigitalOcean provisions a new, separate database cluster for it and configures that cluster to connect to your primary and begin replicating. The node count reported for the primary cluster increases to reflect that it now feeds at least one replica. The replica itself is listed as a separate database cluster in your account, with its own connection string, node plan, and resource allocation, and it is billed as a separate managed database instance.

The doctl databases replica create command takes the ID of the primary cluster you want to replicate from, followed by a name for the replica. The --size flag sets the replica’s node plan, which can differ from the primary’s, so you can tailor resources to the workload. For example, you might choose a smaller, cheaper node plan for read replicas if your read queries are less resource-intensive than your writes.

To manage read/write splitting in your application, you typically use a connection pooler or a proxy layer. For example, you might configure your application to always send SELECT statements to the replica’s connection string and INSERT, UPDATE, and DELETE statements to the primary’s connection string. More advanced solutions involve application-level logic that inspects queries, or libraries designed for read/write splitting.
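The query-inspection approach can be sketched in a few lines. This is a deliberately conservative heuristic, not a full SQL parser: the Router class and its dsn_for method are hypothetical names, and the two connection strings are whatever DigitalOcean gives you for the primary and the replica. Anything that is not obviously a plain SELECT goes to the primary.

```python
import re
from dataclasses import dataclass

# Only statements that start with SELECT are candidates for the replica.
_PLAIN_SELECT = re.compile(r"^\s*select\b", re.IGNORECASE)

# SELECT ... FOR UPDATE/SHARE takes row locks and SELECT ... INTO creates
# a table; both fail on a read-only replica, so they stay on the primary.
_NEEDS_PRIMARY = re.compile(
    r"\b(for\s+(update|share|no\s+key\s+update|key\s+share)|into)\b",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Router:
    primary_dsn: str
    replica_dsn: str

    def dsn_for(self, sql: str, in_transaction: bool = False) -> str:
        """Pick the connection string a statement should run against."""
        # Keep every statement of an open transaction on the primary so
        # the transaction sees its own writes.
        if in_transaction:
            return self.primary_dsn
        if _PLAIN_SELECT.match(sql) and not _NEEDS_PRIMARY.search(sql):
            return self.replica_dsn
        # Writes, DDL, CTEs, and anything unrecognized go to the primary.
        return self.primary_dsn
```

The heuristic errs toward the primary: a false positive costs some offloading, while a write sent to the replica fails outright. It cannot detect a SELECT that calls a function with side effects, so queries like that still need explicit routing.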

A common misconception is that read replicas automatically balance load. They don’t. You are responsible for directing traffic to them. If you don’t explicitly send read queries to the replica’s connection string, the primary will continue to handle all traffic, and the replica will just sit idle, consuming resources.
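One way to catch a misconfigured read path early is to ask the server which role it has. PostgreSQL’s pg_is_in_recovery() returns true on a replica and false on a primary, so a startup check along these lines can confirm that your read connections really point at the replica. The helper names are hypothetical, and the cursor can come from any DB-API driver such as psycopg2.

```python
def connected_to_replica(cursor) -> bool:
    """Return True if this session is connected to a read-only replica."""
    cursor.execute("SELECT pg_is_in_recovery()")
    (in_recovery,) = cursor.fetchone()
    return bool(in_recovery)


def require_replica(cursor) -> None:
    """Fail loudly if the read connection actually points at the primary."""
    if not connected_to_replica(cursor):
        raise RuntimeError(
            "read connection is attached to the primary; "
            "check the replica connection string"
        )
```

Calling require_replica once when the read pool is created is usually enough; it turns a silently idle replica into an immediate, visible error.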

The next logical step after adding read replicas is implementing robust read/write splitting in your application and monitoring replication lag so that stale reads stay within what your application can tolerate.
