CockroachDB’s crdb_internal schema is a goldmine for understanding what’s happening under the hood, but it’s not always obvious how to use it effectively.

Let’s say you’re trying to figure out why a specific query is slow, or why a certain transaction is getting stalled. You can dive into crdb_internal.cluster_transactions to see active transactions, or crdb_internal.gossip_nodes to check if nodes are healthy.

Here’s a quick look at crdb_internal.gossip_nodes in action. Imagine you run this on one of your nodes:

SELECT node_id, is_live
FROM crdb_internal.gossip_nodes
WHERE node_id = 1;

You might see output like this:

 node_id | is_live
---------+----------
       1 | t
(1 row)

This tells you the cluster considers node 1 alive and kicking. If is_live were f (false), that’s your first clue that something is wrong with that specific node.

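Now, let’s say you’re investigating a slow query. You can use crdb_internal.statement_statistics to get detailed performance metrics for queries. This table collects statistics about statement execution, including latency, rows read, and plan details. The statement text lives in the metadata JSONB column and the numbers live in the statistics JSONB column, with one row per statement fingerprint per aggregation interval.

To see the top 5 slowest queries by average latency over roughly the last hour, you’d run:

SELECT
    metadata->>'query' AS query,
    sum((statistics->'statistics'->'svcLat'->>'mean')::FLOAT
        * (statistics->'statistics'->>'cnt')::FLOAT)
      / sum((statistics->'statistics'->>'cnt')::FLOAT) AS avg_latency,
    sum((statistics->'statistics'->>'cnt')::INT) AS execution_count
FROM
    crdb_internal.statement_statistics
WHERE
    aggregated_ts >= NOW() - INTERVAL '1 hour'
GROUP BY
    metadata->>'query'
ORDER BY
    avg_latency DESC
LIMIT 5;

This query helps pinpoint which statements are consistently taking too long. The svcLat mean is the key here: it is the average service latency, in seconds, for one statement fingerprint within one aggregation interval, so the query weights each mean by its execution count (cnt) before combining intervals.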
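One subtlety when averaging latency: if the statistics arrive pre-aggregated as a mean and an execution count per time bucket, a plain average of the means lets a quiet bucket count as much as a busy one. The sketch below is a minimal illustration in Python with made-up numbers; the tuple layout and the function name are assumptions for the example, not part of any CockroachDB API.

```python
def weighted_mean_latency(buckets):
    """Combine (mean_latency_seconds, execution_count) buckets into one mean."""
    total_count = sum(count for _, count in buckets)
    if total_count == 0:
        return 0.0  # no executions recorded, nothing to average
    return sum(mean * count for mean, count in buckets) / total_count


# Hypothetical buckets: one slow execution, then 99 fast ones.
buckets = [(2.0, 1), (0.1, 99)]

# Averaging the means directly ignores how many executions each bucket had.
naive = sum(mean for mean, _ in buckets) / len(buckets)
weighted = weighted_mean_latency(buckets)

print(f"naive={naive:.3f} weighted={weighted:.3f}")
```

With these sample numbers the naive figure is about nine times larger than the weighted one, which is the kind of distortion that can send you chasing a statement that is fast almost every time it runs.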

Understanding the crdb_internal tables allows you to build a comprehensive mental model of your cluster’s behavior. For instance, crdb_internal.ranges provides information about data distribution. You can query it to see which ranges are on which nodes, their sizes, and their status.

SELECT
    r.range_id,
    r.start_key,
    r.end_key,
    r.lease_holder,
    n.address
FROM
    crdb_internal.ranges AS r
JOIN
    crdb_internal.gossip_nodes AS n ON r.lease_holder = n.node_id
WHERE
    r.database_name = 'your_database' AND r.table_name = 'your_table'
LIMIT 10;

This helps you understand data locality and potential bottlenecks related to data placement. The lease_holder column shows which node currently holds the lease for that range. That node serves reads and coordinates writes for the range, so a node holding a disproportionate share of leases can become a hotspot.
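To turn a result like this into a quick skew check, you can count how many of the returned ranges each node holds the lease for. Here is a minimal sketch in Python, assuming the rows have already been fetched as (range_id, lease_holder) pairs; the sample rows and the function name are made up for illustration.

```python
from collections import Counter


def leases_per_node(rows):
    """Count ranges per lease-holder node from (range_id, lease_holder) rows."""
    return Counter(lease_holder for _, lease_holder in rows)


# Hypothetical rows: node 1 holds three of the four leases.
rows = [(101, 1), (102, 1), (103, 2), (104, 1)]
counts = leases_per_node(rows)

# most_common() lists the busiest lease holders first.
for node_id, lease_count in counts.most_common():
    print(f"node {node_id}: {lease_count} leases")
```

A lopsided count for one table is not automatically a problem, but it is a reasonable prompt to look at that node's load before anything else.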

Another crucial table is crdb_internal.kv_operation_estimates. This one is a bit more advanced, showing estimates for the cost of various key-value operations. It’s less about real-time monitoring and more about understanding the potential cost of operations.

If you’re debugging issues related to transaction contention, crdb_internal.transaction_contention_events is your friend. It records recent contention events: which transaction waited, which transaction blocked it, and for how long.

SELECT
    waiting_txn_id,
    blocking_txn_id,
    contention_duration,
    collection_ts
FROM
    crdb_internal.transaction_contention_events
WHERE
    collection_ts >= NOW() - INTERVAL '15 minutes'
ORDER BY
    collection_ts DESC;

This can reveal patterns of contention, like a long-running transaction holding locks that prevent other transactions from proceeding.
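To spot that pattern in the result set, it helps to separate transactions that only block others from those that are themselves waiting. Here is a minimal Python sketch, assuming the rows have been fetched as (waiting transaction ID, blocking transaction ID) pairs; the IDs and the function name are hypothetical.

```python
from collections import Counter


def root_blockers(pairs):
    """Map each blocker that is not itself waiting to the number of waits it caused.

    Only direct waits are counted; transactions queued further down the
    chain are not attributed to the root.
    """
    waiting = {waiter for waiter, _ in pairs}
    waits_caused = Counter(blocker for _, blocker in pairs)
    return {txn: n for txn, n in waits_caused.items() if txn not in waiting}


# Hypothetical events: B and C wait on A, and D waits on B.
pairs = [("B", "A"), ("C", "A"), ("D", "B")]
print(root_blockers(pairs))
```

In this sample, B blocks D but is itself stuck behind A, so only A is reported. That is usually the transaction worth investigating first.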

The one thing most people don’t realize about crdb_internal tables is that many of them are virtual: their rows are generated on demand or aggregated from distributed data. When you query crdb_internal.statement_statistics or crdb_internal.ranges, the information isn’t necessarily residing on the node you’re connected to; it’s being gathered from across the cluster. This means queries against these tables can themselves incur network overhead and processing, especially on very large clusters. It’s a distributed system querying a distributed system.

Once you’ve mastered crdb_internal for debugging, you’ll likely want to explore how to use the SQL API for programmatic cluster management.
