ClickHouse doesn’t actually choose a join strategy at query time; you have to tell it which one to use, and if you don’t, it’ll pick the worst one.

Let’s see what happens when you get a join wrong. Imagine we have two tables: users and orders.

-- users table
CREATE TABLE users (
    user_id UInt64,
    name String
) ENGINE = MergeTree()
ORDER BY user_id;

-- orders table
CREATE TABLE orders (
    order_id UInt64,
    user_id UInt64,
    amount Float64
) ENGINE = MergeTree()
ORDER BY order_id;

Now, let’s try to join them:

SELECT
    u.name,
    o.amount
FROM users AS u
JOIN orders AS o ON u.user_id = o.user_id;

If users has 1 billion rows and orders has 10 billion rows, and you’re running this on a single node, this query will likely grind to a halt or take an extremely long time. The default join strategy, a Cartesian join (or a full nested loop join if there’s an equality condition), would attempt to compare every row in users with every row in orders. That’s 10^18 comparisons. Yeah, no.

ClickHouse has several join strategies, each optimized for different scenarios:

  1. Hash Join (default for equality conditions): This is generally the fastest for large tables.

    • How it works: It builds a hash table on the smaller table (or the table on the "left" side of the join if sizes are similar). Then, it probes this hash table with rows from the larger table.
    • When to use: When joining on equality (=) between two tables, and at least one of them fits into memory.
    • Configuration: ClickHouse automatically attempts to use a hash join if the smaller table can fit into memory. You can hint at it with SETTINGS join_algorithm='hash'.
  2. Merge Join: This is efficient when both tables are already sorted on the join key.

    • How it works: It iterates through both sorted tables simultaneously, merging them based on the join condition. It’s like a merge sort but for joins.
    • When to use: When both tables are sorted by the join key (e.g., they are MergeTree tables and you’re joining on the ORDER BY key, or you’ve explicitly sorted them).
    • Configuration: You can force this with SETTINGS join_algorithm='merge'.
  3. Nested Loop Join (Sequential Scan): This is the slowest and should be avoided for large tables.

    • How it works: For each row in the outer table, it scans the entire inner table to find matching rows.
    • When to use: Only for very small inner tables (e.g., a dimension table with a few dozen rows) or when the join condition is not an equality (e.g., >, <, BETWEEN).
    • Configuration: This is what happens when hash join memory limits are exceeded or when the join condition isn’t suitable for other algorithms.

The key to optimizing joins in ClickHouse is understanding your data and your table structures.

Let’s revisit our users and orders example. If users is significantly smaller than orders, ClickHouse will likely default to a hash join, building a hash table on users.

-- Example: If users table is small enough to fit in memory (e.g., 1 million users)
SELECT
    u.name,
    o.amount
FROM users AS u
INNER JOIN orders AS o ON u.user_id = o.user_id
SETTINGS join_algorithm='hash'; -- Explicitly stating, but often automatic

Here, ClickHouse will read users, build a hash table of user_id -> name in memory. Then, for each row in orders, it calculates the hash of order.user_id and probes the hash table. If a match is found, it returns the joined row. This is O(N + M) complexity, where N is the size of users and M is the size of orders, assuming the hash table fits in memory.

Now, consider if both users and orders were sorted by user_id. This is less common for fact tables like orders but possible.

-- Assume both tables are sorted by user_id
-- (e.g., if users was also ordered by user_id and orders was partitioned/sorted by user_id)
SELECT
    u.name,
    o.amount
FROM users AS u
INNER JOIN orders AS o ON u.user_id = o.user_id
SETTINGS join_algorithm='merge';

In this merge join scenario, ClickHouse would read users and orders concurrently. It would maintain pointers to the current row in each table. If users.user_id == orders.user_id, it outputs the joined row and advances both pointers. If users.user_id < orders.user_id, it advances the users pointer. If users.user_id > orders.user_id, it advances the orders pointer. This is also very efficient, O(N + M), but requires sorted input.

The crucial point is that the JOIN keyword itself doesn’t dictate the strategy; ClickHouse’s query planner tries to pick the best one based on table sizes, available memory, and join conditions. However, if the planner makes a bad guess (e.g., trying to hash-join two tables that don’t fit in memory, or resorting to nested loops when a hash join would have been feasible), performance plummets.

The join_algorithm setting is your primary lever. You can also influence it with join_use_null (whether to keep rows from the left table if no match is found on the right, impacting strategy selection) and max_block_size.

A common pitfall is performing a JOIN where one table is enormous and the other is tiny, but ClickHouse doesn’t have enough memory to build the hash table for the tiny table. In this case, it might fall back to a sequential scan, which is disastrous.

-- Example of a situation where hash join might fail if memory is insufficient
SET max_memory_usage = 1000000000; -- 1GB limit
SET join_algorithm = 'auto'; -- Default

SELECT
    big_table.id,
    small_table.value
FROM big_table
JOIN small_table ON big_table.key = small_table.key;

If small_table is still too large for the max_memory_usage allocated to the query to build its hash table, ClickHouse might switch to a less optimal algorithm. You might need to increase max_memory_usage or ensure small_table is truly small enough.

The most surprising thing about ClickHouse joins is how aggressively it tries to fit the smaller table into memory for a hash join. If it can’t, its fallback behavior can be surprisingly bad, leading to performance that’s orders of magnitude worse than expected. It’s not just about picking the right algorithm; it’s about ensuring the conditions for the best algorithm (usually hash join) are met.

The next rabbit hole you’ll fall down is understanding distributed joins and how join_algorithm behaves differently across shards.

Want structured learning?

Take the full Clickhouse course →