ClickHouse can appear to consume an exorbitant amount of RAM, often leading to OOM kills, but its memory management is more nuanced than a simple leak.
Let’s see it in action. Imagine we’re running a fairly standard ClickHouse setup and we start seeing OOMs. We can poke around using clickhouse-client.
    SELECT
        event,
        value
    FROM system.events
    WHERE event = 'MemoryException' OR event LIKE '%MemoryLimit%' OR event LIKE '%OOM%'
    ORDER BY event;
This query tells us if the server is actively complaining about memory. If we see MemoryExceptions, it’s a strong hint. Now, let’s dive into the memory consumers themselves.
    SELECT
        query_id,
        user,
        memory_usage,
        peak_memory_usage,
        query
    FROM system.processes
    ORDER BY memory_usage DESC
    LIMIT 10;
This gives us a snapshot of what’s using memory right now within active queries. But the real culprit is often not in system.processes because those are transient. We need to look at ClickHouse’s internal memory accounting.
    SELECT
        metric,
        value
    FROM system.metrics
    WHERE metric LIKE '%Memory%'
    ORDER BY value DESC
    LIMIT 20;
This system.metrics table is our primary tool. It shows a more persistent view of memory usage across different ClickHouse components. Here are the most common things to look for that can cause OOMs:
- `GlobalThread`: This metric represents memory allocated by threads that aren’t tied to a specific query. It’s often dominated by memory used for background merges, mutations, and dictionary loading.
  - Diagnosis: Check `system.merges` and `system.mutations` for excessive activity. Look at `system.dictionaries` for large loaded dictionaries.
  - Fix:
    - For merges: Adjust `max_concurrent_merges_in_one_partition` in `config.xml` (or `users.xml`). If it’s too high, many merges might run concurrently, each holding onto memory. Setting it to `1` or `2` can drastically reduce this.
    - For mutations: Consider disabling or limiting mutations if they’re not critical. `ALTER ... UPDATE/DELETE` operations can be memory-intensive.
    - For dictionaries: Optimize dictionary loading. If a dictionary is too large, consider reducing its scope, using a different loading strategy (e.g., `RANGE_HASHED` for smaller sets), or pre-processing it. Ensure `max_memory_usage_for_dictionaries` is set appropriately in `users.xml`.
  - Why it works: Reduces the number of concurrent memory-allocating background tasks or limits the memory available to specific resource-intensive features.
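The diagnosis step for background work can be sketched as the queries below. This is a minimal sketch, assuming the `memory_usage` column of `system.merges` and the `bytes_allocated` column of `system.dictionaries` are available in your ClickHouse version; column sets vary between releases, so treat the column list as an assumption to verify.

```sql
-- Running merges, largest memory consumers first.
SELECT
    database,
    table,
    round(elapsed, 1) AS elapsed_s,
    round(progress, 3) AS progress_frac,
    num_parts,
    formatReadableSize(memory_usage) AS merge_memory
FROM system.merges
ORDER BY memory_usage DESC;

-- Mutations that have not finished yet, oldest first.
SELECT
    database,
    table,
    mutation_id,
    command,
    parts_to_do,
    latest_fail_reason
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time;

-- Loaded dictionaries ordered by the memory they hold.
SELECT
    database,
    name,
    status,
    element_count,
    formatReadableSize(bytes_allocated) AS allocated
FROM system.dictionaries
ORDER BY bytes_allocated DESC;
```

If the first two queries return many rows during the periods when memory climbs, background work is a plausible suspect; an empty result points elsewhere.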
- `QueryThread`: Memory used by query execution threads. This is often the most visible consumer during active queries.
  - Diagnosis: Examine `system.processes` for queries with large `memory_usage` values. Look at the query text to understand what it’s doing (e.g., large `GROUP BY`, `ARRAY JOIN`, `ORDER BY` on unindexed columns).
  - Fix:
    - Set `max_memory_usage` in `users.xml` for specific users or globally. This is a hard limit per query. Example: `max_memory_usage = 10000000000` (10 GB).
    - Optimize queries. Avoid large aggregations without pre-aggregation, use `LIMIT` where appropriate, and ensure data is sorted for `ORDER BY` clauses.
    - Increase `max_threads` if queries are CPU-bound and memory is available, allowing them to finish faster and release memory sooner. Be careful, though: more threads can also raise peak memory usage per query.
  - Why it works: Enforces a hard limit on how much memory a single query can consume, preventing runaway queries from taking down the server.
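Besides `users.xml`, the same per-query limit can be applied from a client session, which is convenient for experimenting before changing configuration. The sketch below assumes the 10 GB value from the example above; the `numbers()` query is only a stand-in workload so that the statement runs without any user tables.

```sql
-- Session-level cap: applies to every following query in this session.
SET max_memory_usage = 10000000000;

-- Per-query cap via the SETTINGS clause.
SELECT
    number % 10 AS bucket,
    count() AS cnt
FROM numbers(1000000)
GROUP BY bucket
ORDER BY bucket
SETTINGS max_memory_usage = 10000000000;

-- Confirm the value in effect for the current session.
SELECT name, value, changed
FROM system.settings
WHERE name = 'max_memory_usage';
```

A query that exceeds the cap fails with a memory limit error instead of continuing to grow, which is the behavior the fix relies on.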
- `MergeTreeData`: Memory used by the MergeTree engine for caching data parts (column data, primary key indexes).
  - Diagnosis: High `MergeTreeData` usage can indicate that many large data parts are being actively read. Check `system.parts` for tables with many parts or large part sizes.
  - Fix:
    - Adjust `max_server_memory_for_merge_tree` in `config.xml` or `users.xml`. This caps the total memory used by MergeTree data caches across the server. Example: `max_server_memory_for_merge_tree = 50000000000` (50 GB).
    - Review table partitioning and `merge_tree` settings. Frequent small inserts can lead to many small parts, increasing cache pressure. Ensure `merge_selecting_task_max_time_to_execute` is not so long that merges slow down and parts accumulate.
  - Why it works: Limits the total amount of RAM ClickHouse can use for caching on-disk data, forcing older, less-used data out of cache.
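The `system.parts` check can be sketched as follows. This is an illustrative query, assuming the `primary_key_bytes_in_memory` column is present in your version. Because setting names differ between ClickHouse releases, the second query lists the merge-related MergeTree settings the server actually exposes, so a name can be confirmed before it is put into configuration.

```sql
-- Tables with the most active parts, with on-disk size and
-- the memory held by their primary key indexes.
SELECT
    database,
    table,
    count() AS active_parts,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS on_disk,
    formatReadableSize(sum(primary_key_bytes_in_memory)) AS pk_in_memory
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY active_parts DESC
LIMIT 20;

-- MergeTree settings known to this server that relate to merges.
SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name LIKE '%merge%'
ORDER BY name;
```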
- `ZooKeeper`: If ClickHouse is configured to use ZooKeeper for replication or distributed query coordination, this metric reflects memory used by the ZooKeeper client library.
  - Diagnosis: Check that `system.zookeeper` responds (queries against it must filter on `path`) and whether the `ZooKeeper` metric in `system.metrics` is consistently high.
  - Fix: This is rarely the primary cause of OOMs unless the ZooKeeper connection itself is misconfigured or experiencing extreme churn. Ensure ZooKeeper is healthy. If it’s a consistently large consumer, replication metadata or distributed query metadata may be generated excessively. Check replication settings and distributed table usage.
  - Why it works: Addresses potential issues in how ClickHouse interacts with ZooKeeper, though direct tuning here is less common than for other metrics.
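A rough way to run the ZooKeeper checks is sketched below. It assumes the server is configured with ZooKeeper (otherwise `system.zookeeper` is not available) and that the ZooKeeper-related metric names start with `ZooKeeper`; the exact set of metrics depends on the version.

```sql
-- ZooKeeper-related counters exposed by the server.
SELECT metric, value, description
FROM system.metrics
WHERE metric LIKE 'ZooKeeper%'
ORDER BY metric;

-- Basic reachability check: system.zookeeper requires a path filter.
SELECT name, numChildren
FROM system.zookeeper
WHERE path = '/';

-- Replication backlog per table; a large, growing queue suggests
-- replication metadata is piling up.
SELECT
    database,
    table,
    count() AS queued_entries
FROM system.replication_queue
GROUP BY database, table
ORDER BY queued_entries DESC;
```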
- `SystemMemory`: This is a fallback metric and often reflects ClickHouse’s overall memory footprint that isn’t categorized elsewhere.
  - Diagnosis: If `SystemMemory` is high and other specific metrics are not clearly dominating, it suggests memory is being used by less obvious components. This could include internal buffers, caches for query plans, or memory used by UDFs.
  - Fix:
    - Review `max_server_memory_usage` in `config.xml`, the server-wide ceiling (as opposed to the per-query `max_memory_usage` in `users.xml`).
    - Check for custom dictionaries, UDFs, or complex queries against `system` tables that might be allocating significant memory.
    - Ensure ClickHouse is running on an adequately sized instance. Sometimes, the server simply doesn’t have enough RAM for its intended workload.
  - Why it works: Provides an overarching safety net by limiting the total memory ClickHouse can request from the OS.
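To see how much memory is not attributed to any specific component, one option is to compare what ClickHouse tracks internally with what the operating system reports for the process. This is a sketch, assuming the `MemoryTracking` metric and the `MemoryResident` and `MemoryVirtual` asynchronous metrics exist under those names in your version.

```sql
-- Memory accounted for by ClickHouse's own tracker.
SELECT formatReadableSize(value) AS tracked_by_clickhouse
FROM system.metrics
WHERE metric = 'MemoryTracking';

-- Memory of the server process as seen by the OS.
SELECT
    metric,
    formatReadableSize(value) AS size
FROM system.asynchronous_metrics
WHERE metric IN ('MemoryResident', 'MemoryVirtual')
ORDER BY metric;
```

A large gap between the tracked value and the resident size hints at allocations outside the tracker, such as allocator caches, rather than at any single query.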
- `Ephemeral`: Memory used for temporary data structures during query execution, often related to sorting, hash tables, and intermediate results.
  - Diagnosis: This metric is closely tied to `QueryThread` and specific query operations. Large `GROUP BY` operations, `ORDER BY` on unsorted data, or complex joins can spike this.
  - Fix:
    - Optimize queries to reduce the need for large temporary structures. For example, pre-aggregate data or ensure `ORDER BY` clauses match data sorting.
    - Increase `max_memory_usage` if the query is legitimately complex and needs more memory, but be cautious.
    - Consider tuning `max_block_size` (though this is more about I/O efficiency) and `max_insert_block_size`, as they can indirectly affect how much data is processed in memory at once.
  - Why it works: By optimizing the query or providing more memory headroom, you allow these temporary structures to be built without exceeding system limits.
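The advice to make `ORDER BY` clauses match data sorting can be illustrated with a small example. The table `events_example` and its columns are hypothetical and exist only for this sketch; the point is that the query's `ORDER BY` is a prefix of the table's sorting key, so the server can read in key order instead of building a large in-memory sort.

```sql
-- Hypothetical table whose sorting key is (user_id, ts).
CREATE TABLE IF NOT EXISTS events_example
(
    ts DateTime,
    user_id UInt64,
    value Float64
)
ENGINE = MergeTree
ORDER BY (user_id, ts);

-- ORDER BY matches the sorting key, so rows can be streamed in order.
SELECT user_id, ts, value
FROM events_example
ORDER BY user_id, ts
LIMIT 100
SETTINGS optimize_read_in_order = 1;

-- Current values of the block-size settings mentioned above.
SELECT name, value, changed
FROM system.settings
WHERE name IN ('max_block_size', 'max_insert_block_size');
```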
After applying these fixes, rerun the diagnostic queries above and confirm that the memory-related counters in system.events have stopped climbing and that the server is no longer being OOM-killed under its normal workload.