CockroachDB Cloud’s pricing isn’t a simple "pay for what you use" model; you’re paying for provisioned capacity, which can lead to significant overspending if not managed actively.

Let’s look at a typical CockroachDB Cloud cluster and how we can tune it.

{
  "name": "my-prod-cluster",
  "region": "us-east-1",
  "cloud_provider": "aws",
  "plan": {
    "name": "standard",
    "nodes": 3,
    "instance_type": "m5.xlarge",
    "disk_size_gb": 100
  },
  "num_replicas": 3,
  "created_at": "2023-10-27T10:00:00Z",
  "state": "ACTIVE"
}

This standard plan with 3 m5.xlarge instances and 100GB of disk per node might be overkill, especially if your workload fluctuates. The key is understanding that you pay for the compute and storage you provision, whether or not your workload ever uses it.

Right-Sizing Your Instances

The most impactful cost-saving measure is often choosing the right instance type. CockroachDB is designed to scale horizontally, meaning more nodes are often better than fewer, larger nodes. However, the size of those nodes matters.

Diagnosis: Monitor your cluster’s resource utilization through the CockroachDB Cloud console or by querying crdb_internal.node_runtime_statistics. Look for metrics like cpu_usage_cores and memory_bytes_used. Tables under crdb_internal are internal and can change between versions, so confirm these table and column names against your cluster before relying on them. If average CPU utilization is consistently below 50% and memory usage is well within limits on every node, you’re likely over-provisioned.

Fix: If you have m5.xlarge (4 vCPU, 16 GiB RAM) nodes and observe low utilization, consider downsizing. For many workloads, m5.large (2 vCPU, 8 GiB RAM) might suffice. The AWS m5 family has no size below large, so going smaller than that means moving to a different instance family. For example, changing from m5.xlarge to m5.large for a 3-node cluster halves the vCPUs and memory on each node and can reduce compute costs by roughly 50%. The specific instance types and their performance characteristics depend on your cloud provider (AWS m5 series, GCP n2-standard, Azure Dsv3 series, etc.).

Why it works: You’re directly reducing the hourly cost associated with each running instance by selecting a smaller, less resource-intensive (and therefore cheaper) virtual machine. CockroachDB’s distributed nature means it can efficiently utilize resources across multiple smaller nodes.
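The check described above can be sketched in a few lines of Python. This is only an illustration: the node names and utilization samples are made up, and the savings function assumes that price scales linearly with vCPU count, which is roughly true within one instance family but should be verified against your provider's price list.

```python
from statistics import mean


def is_overprovisioned(cpu_samples_by_node, threshold=0.50):
    """Return True if every node's average CPU utilization is below threshold.

    cpu_samples_by_node maps a node name to a list of utilization
    fractions between 0.0 and 1.0.
    """
    return all(mean(samples) < threshold for samples in cpu_samples_by_node.values())


def downsizing_savings_fraction(current_vcpus, target_vcpus):
    """Fraction of compute cost saved, ASSUMING price is linear in vCPUs."""
    return 1 - target_vcpus / current_vcpus


# Hypothetical utilization samples for a 3-node cluster.
samples = {
    "node1": [0.22, 0.31, 0.27],
    "node2": [0.18, 0.25, 0.20],
    "node3": [0.35, 0.40, 0.30],
}

print(is_overprovisioned(samples))            # True
print(downsizing_savings_fraction(4, 2))      # 0.5 (m5.xlarge -> m5.large)
```

Requiring every node to be under the threshold, rather than the cluster-wide average, avoids downsizing when a single hot node is already close to its limit.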

Optimizing Disk Size

Disk costs can also add up, especially if you’re over-allocating. CockroachDB Cloud typically uses provisioned IOPS and storage, so having more disk than you need incurs unnecessary charges.

Diagnosis: Check crdb_internal.node_disk_statistics or the CockroachDB Cloud console for actual disk usage per node. Compare this to your disk_size_gb setting.

Fix: If a 100GB disk per node is consistently underutilized (e.g., only 20GB used), you can reduce it. For instance, decreasing from 100GB to 50GB per node for a 3-node cluster removes 3 * 50GB = 150GB of provisioned storage, so the monthly saving is 150GB multiplied by your cost per GB per month. The exact cost savings depend on your cloud provider’s EBS, persistent disk, or managed disk pricing.

Why it works: You’re paying for the allocated storage capacity. By reducing this allocation to match your actual needs, you eliminate the charges for unused disk space.
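As a rough worked example of the arithmetic above, the following Python sketch computes disk utilization and the monthly saving from a smaller allocation. The price per GB is a placeholder, not a real quote; substitute the rate from your own bill.

```python
def disk_utilization(used_gb, provisioned_gb):
    """Fraction of the provisioned disk that is actually in use."""
    return used_gb / provisioned_gb


def monthly_disk_savings(nodes, current_gb, target_gb, price_per_gb_month):
    """Monthly saving from shrinking each node's provisioned disk."""
    if target_gb > current_gb:
        raise ValueError("target_gb must not exceed current_gb")
    return nodes * (current_gb - target_gb) * price_per_gb_month


# Hypothetical storage price; check your provider's actual rate.
PRICE_PER_GB_MONTH = 0.10

print(f"{disk_utilization(20, 100):.0%}")                                # 20%
print(f"${monthly_disk_savings(3, 100, 50, PRICE_PER_GB_MONTH):.2f}")    # $15.00
```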

Implementing Node Scheduling

Not all workloads require 24/7 availability. For development, staging, or even some less critical production environments, you can significantly cut costs by scheduling your cluster to run only when needed.

Diagnosis: Analyze your cluster’s access patterns. Are there predictable periods of inactivity (e.g., nights, weekends, specific business hours)?

Fix: Use CockroachDB Cloud’s built-in scheduling feature, where it is available for your plan. You can configure your cluster to stop automatically at, say, 6 PM on weekdays, start again at 8 AM, and remain stopped on weekends. On that schedule the cluster runs for 50 of the 168 hours in a week, so compute costs fall by roughly 70%. A less aggressive schedule that keeps the cluster up for half the week saves about 50%. You can configure this via the CockroachDB Cloud UI under "Cluster Settings" -> "Scheduling."

Why it works: When a cluster is stopped, you are no longer billed for the compute resources (CPU, RAM) of the nodes. You typically still incur minimal charges for the provisioned storage, but the primary driver of cost (compute) is eliminated.
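To estimate what a given schedule is worth, it helps to compute the fraction of the week the cluster is stopped. The Python sketch below does that for a weekday 8 AM to 6 PM schedule; it assumes compute is billed only while the cluster is running and ignores the storage charges that continue while it is stopped.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_running_hours(start_hour, stop_hour, days_per_week):
    """Hours per week the cluster is up, for a same-day start/stop schedule."""
    if not 0 <= start_hour < stop_hour <= 24:
        raise ValueError("expected 0 <= start_hour < stop_hour <= 24")
    return (stop_hour - start_hour) * days_per_week


def compute_savings_fraction(running_hours):
    """Fraction of always-on compute cost avoided by the schedule."""
    return 1 - running_hours / HOURS_PER_WEEK


hours = weekly_running_hours(start_hour=8, stop_hour=18, days_per_week=5)
print(hours)                                       # 50
print(f"{compute_savings_fraction(hours):.0%}")    # 70%
```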

Leveraging Serverless

For workloads with highly unpredictable or spiky traffic, or for applications that don’t require a persistent, always-on database, CockroachDB Serverless can be a more cost-effective option.

Diagnosis: Your workload experiences extreme peaks and valleys, making it hard to predict provisioned capacity needs, or the database is only used intermittently.

Fix: Migrate to a CockroachDB Serverless instance. Serverless clusters automatically scale compute up and down based on demand, and you’re billed for the Request Units your queries consume and for the storage used, rather than for provisioned instances. This can be dramatically cheaper for low-to-moderate, bursty, or intermittent usage patterns.

Why it works: Serverless abstracts away instance management and billing. You pay for actual usage (requests, storage) rather than idle provisioned capacity, aligning costs directly with demand.
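One way to decide between the two models is to find the usage level at which they cost the same. The Python sketch below is a simplified comparison under stated assumptions: every rate in it is a hypothetical placeholder, and real usage-based billing may meter a different unit than the "million requests" used here.

```python
def provisioned_monthly_cost(nodes, hourly_rate_per_node, hours_per_month=730):
    """Cost of always-on provisioned nodes for one month."""
    return nodes * hourly_rate_per_node * hours_per_month


def usage_based_monthly_cost(million_requests, price_per_million,
                             storage_gb, price_per_gb_month):
    """Cost when billed for usage plus stored data."""
    return million_requests * price_per_million + storage_gb * price_per_gb_month


def break_even_million_requests(provisioned_cost, price_per_million,
                                storage_gb, price_per_gb_month):
    """Usage (in millions of requests) at which both models cost the same."""
    return (provisioned_cost - storage_gb * price_per_gb_month) / price_per_million


# All rates below are hypothetical placeholders, not real prices.
provisioned = provisioned_monthly_cost(nodes=3, hourly_rate_per_node=0.20)
break_even = break_even_million_requests(provisioned, price_per_million=1.00,
                                         storage_gb=20, price_per_gb_month=0.50)

print(f"${provisioned:.2f}")     # $438.00
print(f"{break_even:.0f}M")      # 428M
```

Below the break-even volume the usage-based model is cheaper under these assumptions; above it, provisioned capacity wins.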

Understanding Replication Factor and Node Count

While 3 replicas (a replication factor of 3) is the default and recommended for most production workloads for high availability, you might have specific scenarios where fewer replicas are acceptable for non-critical environments.

Diagnosis: You have a cluster with a high replication factor (e.g., 5) and many nodes, but the availability requirements are not extremely stringent, or you’re using it for development/testing.

Fix: Consider reducing the number of nodes and/or replicas if your availability requirements allow. For a development cluster, a single-node cluster with 1 replica might be sufficient, reducing compute and storage costs by roughly two-thirds compared to a 3-node, 3-replica setup on the same instance size. For production, reducing from 5 nodes to 3 nodes (keeping replication factor 3) will reduce compute costs by 40%.

Why it works: Each node and each replica adds to the compute and storage costs. Reducing these directly lowers your bill. However, this comes at the expense of fault tolerance.
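The trade-off between cost and fault tolerance can be made concrete with a short Python sketch. It assumes every node costs the same, and it uses the majority-quorum rule: a range stays available only while more than half of its replicas survive. Under that cost assumption, going from 3 nodes to 1 works out to about 67%.

```python
def node_reduction_savings(current_nodes, target_nodes):
    """Fraction of compute cost saved, ASSUMING all nodes cost the same."""
    return 1 - target_nodes / current_nodes


def tolerated_replica_failures(replication_factor):
    """Replicas that can be lost while a majority (quorum) still survives."""
    return (replication_factor - 1) // 2


print(f"{node_reduction_savings(5, 3):.0%}")   # 40%
print(f"{node_reduction_savings(3, 1):.0%}")   # 67%

for rf in (1, 3, 5):
    # Prints: 1 0, then 3 1, then 5 2
    print(rf, tolerated_replica_failures(rf))
```

A single replica tolerates no failures at all, which is why that configuration is only reasonable for environments where losing the database is acceptable.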

Reviewing Storage IOPS

Beyond just disk size, some cloud providers charge for provisioned IOPS (Input/Output Operations Per Second). If your workload is not I/O intensive, you might be overpaying.

Diagnosis: Examine your cloud provider’s disk configuration and CockroachDB Cloud’s performance metrics. If your disk I/O utilization (IOPS used vs. provisioned) is consistently low, you’re likely paying for more than you need.

Fix: If possible, adjust your storage to a configuration with lower provisioned IOPS or a magnetic/standard tier if your provider offers it and your workload permits. For example, if your provider charges $0.10 per provisioned IOPS per month and you have 1000 provisioned IOPS but only use 100, reducing to 200 provisioned IOPS could save $80 per month per disk.

Why it works: You stop paying for I/O capacity your workload never uses, so the cost tracks your actual I/O needs rather than the disk’s maximum performance.
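The IOPS example above reduces to simple arithmetic, sketched here in Python. The $0.10 rate is the illustrative figure from the text, not a real provider price, and the 200 IOPS target simply leaves some headroom over the 100 IOPS observed.

```python
def iops_utilization(used_iops, provisioned_iops):
    """Fraction of provisioned IOPS the workload actually uses."""
    return used_iops / provisioned_iops


def monthly_iops_savings(provisioned_iops, target_iops, price_per_iops_month):
    """Monthly saving per disk from lowering provisioned IOPS."""
    if target_iops > provisioned_iops:
        raise ValueError("target_iops must not exceed provisioned_iops")
    return (provisioned_iops - target_iops) * price_per_iops_month


# Illustrative rate taken from the example in the text.
PRICE_PER_IOPS_MONTH = 0.10

print(f"{iops_utilization(100, 1000):.0%}")                                   # 10%
print(f"${monthly_iops_savings(1000, 200, PRICE_PER_IOPS_MONTH):.2f}")        # $80.00
```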

By actively managing these aspects, you can ensure your CockroachDB Cloud spend is directly tied to your actual usage and performance requirements, rather than paying for unused capacity.
