CockroachDB’s query planner doesn’t just pick the "best" plan; it actively adjusts plans based on real-time data distribution and workload, making it more of a dynamic negotiator than a static decision-maker.
Let’s see this in action. Imagine we have a users table:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
username STRING NOT NULL UNIQUE,
email STRING NOT NULL,
signup_date TIMESTAMP DEFAULT now(),
last_login TIMESTAMP DEFAULT now(),
INDEX user_email_idx (email)
);
And we want to find a user by their email.
SELECT id, username FROM users WHERE email = 'alice@example.com';
When you run this, CockroachDB doesn’t just have a pre-baked plan. It looks at your data. How many rows are there? Is user_email_idx a good fit? Is the distribution of emails skewed? It might decide to use the index:
sql_statement: SELECT id, username FROM users WHERE email = 'alice@example.com'
execution_plan:
- USING INDEX user_email_idx
- FILTERING BY email = 'alice@example.com'
- FETCHING id, username
Now, what if alice@example.com is extremely common, and user_email_idx is massive? The planner might realize that scanning the index for that one value and then fetching from the primary key might be slower than just doing a full table scan if the table is small enough or the data distribution is really weird. It could then choose this plan:
sql_statement: SELECT id, username FROM users WHERE email = 'alice@example.com'
execution_plan:
- SCAN FROM TABLE users
- FILTERING BY email = 'alice@example.com'
- FETCHING id, username
The core problem CockroachDB’s planner solves is the inherent difficulty of choosing an optimal query execution strategy in a distributed, continuously changing database. Unlike a static database where data distribution is relatively stable, CockroachDB’s data is sharded, replicated, and can rebalance. A plan that’s good one minute might be terrible an hour later if data shifts or a new index becomes more relevant.
Here’s how it works internally: When you execute a query, the sql.DistributedPlanner takes over. It first parses the query and then enters the sql.QueryOptimizer. This optimizer has a Cost-Based Optimizer (CBO) that estimates the cost of different execution paths. It considers:
- Statistics: It uses statistics about your tables and indexes (cardinality, distinct values, null counts, data distribution histograms) to make educated guesses about how many rows an operation will return.
- Indexes: It knows about available indexes and their potential to reduce scan or join costs.
- Join Algorithms: For joins, it considers different algorithms like hash joins, merge joins, and nested loop joins, estimating their performance based on data sizes.
- Distribution: It’s aware of the distributed nature, factoring in network hops and data locality where possible.
The optimizer generates multiple candidate plans and picks the one with the lowest estimated cost. This plan is then sent to the execution engine.
The key levers you control are:
- Schema Design: Proper primary keys, secondary indexes, and data types are paramount. A well-indexed table is the planner’s best friend.
- Statistics: CockroachDB automatically collects statistics, but you can influence this.
CREATE STATISTICS s_email_dist ON users (email)can provide more granular histograms if the default isn’t capturing important distribution details.ALTER TABLE users EXPERIMENTAL_RELOCATEcan also hint at data distribution, though this is a more advanced, low-level operation. - Hints: For tricky cases, you can provide hints.
SELECT /*+ AS OF SYSTEM TIME '...' */ ...can force a specific point-in-time read, which can influence plan stability.SELECT /*+ USE_INDEX(users user_email_idx) */ ...forces the use of a particular index.SELECT /*+ ORDER BY_KEY(users id) */ ...can hint that the results are already sorted byid.
Most people don’t realize that CockroachDB’s planner is stateful within a transaction or session to some extent. If you run a query, and then immediately run another very similar query that benefits from the data already being in memory or the node having recently accessed that data, the planner might favor plans that leverage that cached state. This is subtle and not explicitly controllable via hints, but it’s why performance can sometimes feel "sticky" or improve after repeated access.
The next frontier is understanding how EXPLAIN (OPT) provides a deeper, more detailed look into the optimizer’s decision-making process, revealing exactly why a particular plan was chosen and what alternatives were considered.