ClickHouse read queries can feel like they’re stuck in molasses, and the culprit isn’t always obvious.
Here’s how to dig in and find out why your SELECT statements are dragging their feet.
1. Isolate the Slow Query
First, we need to confirm which query is the problem. The ClickHouse server logs slow queries by default.
Diagnosis:
Check your ClickHouse server logs (usually /var/log/clickhouse-server/clickhouse-server.log or similar, depending on your OS and installation). Look for lines containing Query took.
Example Log Entry:
2023.10.27 10:30:00.123 [ 12345 ] <Info> <>: <Query> SELECT count() FROM my_table WHERE event_date = '2023-10-26' Query took 15.678 sec. Rows: 1000000000 Processed rows: 1000000000
This tells you the exact query and how long it took.
2. Analyze the Query Plan
Once you have the slow query, let’s see how ClickHouse plans to execute it. This is crucial for understanding where time is being spent.
Diagnosis:
Prepend EXPLAIN to your slow query.
Command:
EXPLAIN SELECT count() FROM my_table WHERE event_date = '2023-10-26';
Output Interpretation:
The EXPLAIN output will show you the stages of the query. Look for stages involving:
MergeTreeoperations (reading data from disk)Filtersteps (applyingWHEREclauses)Aggregatesteps (if you haveGROUP BYor aggregations)Joinsteps (if applicable)
Pay close attention to the estimated number of rows processed at each stage. If a stage estimates processing millions or billions of rows when you expect far fewer, that’s a red flag.
3. Check Data Skipping
ClickHouse uses various indices and data structures to skip reading unnecessary data. If this isn’t working, it will scan way more than it needs.
Common Causes & Fixes:
-
Primary Key Granularity:
- Diagnosis: Examine your table definition. The
ORDER BYclause defines the primary key. If it’s not selective enough (e.g., just a timestamp, or a low-cardinality field), ClickHouse might not be able to skip much data. - Fix: Ensure your
ORDER BY(which also serves as the primary key for data skipping) is based on columns that are frequently used inWHEREclauses and have high cardinality. For example,ORDER BY (event_date, user_id)is often better thanORDER BY event_date. - Why it works: ClickHouse uses the primary key to create sparse indices. A more selective primary key allows the index to point to smaller chunks of data, reducing disk I/O.
- Diagnosis: Examine your table definition. The
-
Secondary Indices (Skip Indexes):
- Diagnosis: If you’ve defined secondary indices (e.g.,
INDEX index_name column TYPE minmax GRANULARITY 1), check if they are being used. You can often infer this from theEXPLAINplan or by looking at thesystem.query_logtable forread_bytesvs.bytes_uncompressed. - Fix: Create appropriate skip indexes. For example, if you frequently filter by
user_idandevent_datein a table not ordered by them:ALTER TABLE my_table ADD INDEX user_id_idx user_id TYPE bloom_filter GRANULARITY 1024; ALTER TABLE my_table ADD INDEX event_date_idx event_date TYPE set(10000) GRANULARITY 1024; - Why it works: Bloom filters, min/max indices, and set indices allow ClickHouse to quickly determine if a block of data might contain relevant rows, enabling it to skip blocks that definitely do not.
GRANULARITYdetermines how often the index is updated within a data part.
- Diagnosis: If you’ve defined secondary indices (e.g.,
-
Data Partitioning:
- Diagnosis: Check your
PARTITION BYclause in theCREATE TABLEstatement. If you’re partitioning by a granularity that’s too large (e.g., partitioning by month when you query daily data), ClickHouse will scan all data for the current month instead of just a day. - Fix: Adjust partitioning. For daily data,
PARTITION BY toYYYYMMDD(event_date)is common. - Why it works: Partitioning physically separates data on disk based on the partition key. When you filter by the partition key, ClickHouse only needs to read data from the relevant partitions, drastically reducing I/O.
- Diagnosis: Check your
4. Evaluate Data Compression and Codec
Ineffective compression can lead to larger data files, meaning more data to read from disk.
Diagnosis:
Use system.columns to check the CompressionCodec for your table columns.
Command:
SELECT name, compression_codec FROM system.columns WHERE database = 'default' AND table = 'my_table';
Output Example:
┌─name──────┬─compression_codec─┐
│ event_date│ LZ4 │
│ user_id │ ZSTD(3) │
│ value │ DELTA,LZ4 │
└─────────────┴───────────────────┘
Fix:
- ZSTD: Generally offers the best compression ratio for a good balance of speed.
ZSTD(1)is fast,ZSTD(3)orZSTD(5)offer better compression at a slightly higher CPU cost. - LZ4: Very fast decompression, but lower compression ratio. Good for columns that are read very frequently and where CPU is a bottleneck.
- Delta/DoubleDelta: Excellent for sequential or nearly sequential data (like timestamps or counters). Often combined with LZ4 or ZSTD.
If you see NONE or just LZ4 on columns that benefit from better compression (like numeric IDs or timestamps), consider changing it:
ALTER TABLE my_table MODIFY COLUMN user_id ZSTD(3);
ALTER TABLE my_table MODIFY COLUMN event_date CODEC(ZSTD(1), LZ4); -- Apply multiple codecs
Why it works: Better compression means smaller files on disk. While decompression adds CPU overhead, it often saves significantly more time by reducing the amount of data that needs to be read from disk (which is typically much slower).
5. Examine Data Granularity (GRANULARITY)
The GRANULARITY setting in MergeTree tables dictates how many rows are grouped together for the primary index.
Diagnosis:
Check your CREATE TABLE statement for the GRANULARITY parameter. Default is 8192.
Fix:
- Too High: If
GRANULARITYis too high (e.g.,65536), the primary index becomes too sparse, and ClickHouse has to read more data per index entry. - Too Low: If
GRANULARITYis too low (e.g.,128), index files become very large, increasing memory usage and potentially slowing down index lookups.
For most analytical workloads, the default GRANULARITY = 8192 is a good starting point. If your queries are still slow and EXPLAIN shows large row counts being scanned, consider decreasing it slightly (e.g., to 4096).
-- Note: Changing GRANULARITY requires rebuilding the table or parts.
-- This is a more advanced operation and usually done during table creation or via ALTER TABLE ... MODIFY ... which may rewrite data.
-- For existing tables, consider creating a new table with the desired granularity and migrating data.
Why it works: GRANULARITY balances index size and data skipping efficiency. A lower granularity means more index entries, but each entry covers fewer rows, allowing for more precise data skipping.
6. Check for Unnecessary Data Loading or Joins
Sometimes, the slowness isn’t in reading from the main table, but in how data is brought in or combined.
Diagnosis:
- Subqueries/CTEs: Analyze your
EXPLAINplan for subqueries or Common Table Expressions (CTEs). If they’re materialized and large, they could be a bottleneck. - Joins: If your query involves
JOINs, check the join keys and the size of the tables being joined. Unoptimized joins, especially with large right-hand tables, are notorious performance killers. INvs.JOIN: For large lists inWHERE col IN (list), consider if aJOINwith a temporary table or a pre-aggregated table would be more efficient.
Fix:
- Subqueries: Rewrite subqueries to be more efficient, or materialize them differently if possible.
- Joins: Ensure join keys are indexed and have matching data types. Use
GLOBAL JOINorANY JOINwhere appropriate. Consider denormalizing your schema if joins are consistently problematic. INclause: If the list inINis large and static, consider creating a temporary table and joining against it.
Why it works: ClickHouse is optimized for analytical queries. Complex procedural logic or inefficient joins can bypass its strengths, leading to performance degradation.
7. Query Profiling
For very complex or elusive performance issues, dive into the query profile.
Diagnosis:
Run your query with FORMAT PROFILEEvents.
Command:
SELECT count() FROM my_table WHERE event_date = '2023-10-26' FORMAT PROFILEEvents;
Output Interpretation:
This provides a detailed breakdown of events during query execution. Look for:
ReadBuffer_(I/O operations)Execute(CPU time spent in various functions)Network(if applicable)UncompressedCache(if enabled and being hit/missed)
Fix:
This output often points back to the previous issues. For example, high ReadBuffer_ counts suggest I/O bottlenecks (likely due to poor data skipping or compression). High Execute counts in specific functions might indicate inefficient UDFs or complex aggregations.
Why it works: The profile provides a granular, event-by-event view of what the query is doing, allowing you to pinpoint the exact operations consuming the most resources.
If you’ve addressed data skipping, compression, and partitioning, and your queries are still slow, the next bottleneck you’ll likely encounter is CPU saturation from complex aggregations or data transformations.