The most surprising thing about ClickHouse’s AggregatingMergeTree is that it doesn’t store finished aggregates; it stores intermediate aggregate states, which are combined during background merges and finalized only at query time.
Let’s see this in action. Imagine we have a table to track website visits and want to count unique visitors per day.
CREATE TABLE website_visits (
    visit_date Date,
    unique_users AggregateFunction(uniq, UInt64) -- Intermediate uniq state, not raw user_ids
    -- Other columns...
) ENGINE = AggregatingMergeTree()
ORDER BY visit_date;
Now, we insert some data. A state column cannot be filled from a plain VALUES list, so we build the states with uniqState(user_id) in an INSERT ... SELECT that groups the raw visits by day.
INSERT INTO website_visits
SELECT
    visit_date,
    uniqState(user_id) AS unique_users
FROM VALUES(
    'visit_date Date, user_id UInt64',
    ('2023-10-26', 101),
    ('2023-10-26', 102),
    ('2023-10-26', 101), -- Duplicate user
    ('2023-10-27', 103),
    ('2023-10-27', 101)
)
GROUP BY visit_date;
When you query this table, you never see raw user_ids. The unique_users column holds aggregate states, and reading it directly returns an opaque value:
SELECT
    visit_date,
    unique_users AS unique_users_state -- This is the intermediate state
FROM website_visits;
This query returns unique_users_state, which is a binary blob representing the partial aggregation. To get the final count, merge the states with the matching Merge function (here uniqMerge) and group by the sorting key, because parts that have not been merged yet may still hold separate states for the same day:
SELECT
    visit_date,
    uniqMerge(unique_users) AS unique_users_count -- Merge and finalize the states
FROM website_visits
GROUP BY visit_date;
This returns 2 for '2023-10-26' (users 101, 102) and 2 for '2023-10-27' (users 103, 101). The magic is that uniqState doesn’t store 101, 102, 103. It stores a compact representation that allows merging and finalization.
The core problem AggregatingMergeTree solves is speeding up aggregation queries, especially on large datasets with repeated aggregations. Traditional GROUP BY queries on raw data require scanning all relevant rows and performing the aggregation from scratch every time. With AggregatingMergeTree, the aggregation process is split into two phases: creating intermediate states during inserts and merging/finalizing these states during queries. This drastically reduces the amount of data that needs to be processed at query time.
Internally, each row in an AggregatingMergeTree table doesn’t store the raw values for the aggregated columns. Instead, each such column is declared with an AggregateFunction(...) type and stores the result of an aggregate function that returns a state (like uniqState, sumState, countState, avgState, groupArrayState). These state-returning functions produce a compact, often binary, representation of the aggregation progress.
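To make the idea of a state concrete, the query below asks ClickHouse for the type of a few state-returning calls. It is only a small sketch run against the built-in numbers table function; the column aliases are made up for illustration.

```sql
-- Each -State call returns a value of type AggregateFunction(<function>, <argument type>),
-- not a number. numbers(10) yields a UInt64 column called number.
SELECT
    toTypeName(uniqState(number)) AS uniq_state_type, -- AggregateFunction(uniq, UInt64)
    toTypeName(sumState(number))  AS sum_state_type,  -- AggregateFunction(sum, UInt64)
    toTypeName(avgState(number))  AS avg_state_type   -- AggregateFunction(avg, UInt64)
FROM numbers(10);
```

These are the same type names you would write in the column definitions of an AggregatingMergeTree table.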
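Each INSERT creates a new data part, and by default rows within that part that share the same sorting key are combined into a single row of states. When parts are merged (which happens in the background for MergeTree tables, at a time you don’t control), the states for each key are combined again. This is the same operation that uniqMerge, sumMerge, countMerge, avgMerge, and groupArrayMerge perform at query time. Combining two intermediate states yields a new state of the same type, so the process can repeat as parts keep merging. Because a merge may not have happened yet, the table can still hold several rows for one key, which is why queries should always aggregate with a Merge function and GROUP BY.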
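The effect of part merges can be observed directly. The sketch below uses a hypothetical table, visits_agg_example, and forces the merge with OPTIMIZE ... FINAL purely for demonstration; in normal operation you would let ClickHouse schedule merges itself.

```sql
-- Hypothetical table used only for this demonstration.
CREATE TABLE visits_agg_example (
    visit_date Date,
    visitors   AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY visit_date;

-- Two separate inserts create two data parts, each with one state for the same day.
-- The first covers ids 0..99, the second ids 50..149.
INSERT INTO visits_agg_example
SELECT toDate('2023-10-26'), uniqState(toUInt64(number)) FROM numbers(100);

INSERT INTO visits_agg_example
SELECT toDate('2023-10-26'), uniqState(toUInt64(number + 50)) FROM numbers(100);

-- Until a merge happens there is usually one row per part for this key.
SELECT count() FROM visits_agg_example;

-- Force the merge that would otherwise run in the background.
OPTIMIZE TABLE visits_agg_example FINAL;

-- The two states are now combined into a single row.
SELECT count() FROM visits_agg_example;

-- uniqMerge gives the same answer before and after the merge
-- (about 150 here; uniq is an approximate function in general).
SELECT visit_date, uniqMerge(visitors) AS unique_visitors
FROM visits_agg_example
GROUP BY visit_date;
```

The last query is the one to rely on: it is correct whether or not the parts have been merged.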
The query phase then reads these pre-merged aggregate states. Instead of processing raw data, it applies the function with the Merge suffix (e.g., uniqMerge, sumMerge, countMerge, avgMerge) to combine the remaining states and produce the final result; a single state can also be finalized with finalizeAggregation. The plain functions (uniq, sum, count, avg) expect raw values and are not the way to finalize a state column. This is significantly faster than aggregating raw rows because the expensive part of the aggregation (processing individual data points) has already been done and summarized into states.
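The two ways of turning states into values can be tried without creating any table. This is a minimal sketch over numbers(10), using sum so the result is exact; the aliases are illustrative.

```sql
-- Build two partial states (even and odd numbers), then merge and finalize them.
-- Evens sum to 20, odds to 25, so the merged result is 45.
SELECT sumMerge(partial_sum) AS total
FROM (
    SELECT sumState(number) AS partial_sum
    FROM numbers(10)
    GROUP BY number % 2
);

-- finalizeAggregation converts one state into its value without a further GROUP BY.
SELECT finalizeAggregation(sumState(number)) AS total
FROM numbers(10);
```

Use the Merge form when several states have to be combined (the usual case for a table) and finalizeAggregation when you already hold exactly one state per row.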
The ORDER BY clause in AggregatingMergeTree is crucial. It acts as the grouping key: rows with the same ORDER BY value are collapsed into one row when parts merge, with their aggregate states combined. A key that matches the dimensions you query by keeps the table small and merges effective, while a key with too many distinct values leaves nearly one row per input row and little is gained.
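As an illustration of the sorting key acting as the grouping key, here is a sketch with a hypothetical two-column key. The table name and the page column are assumptions made for the example.

```sql
-- One state is kept per (visit_date, page) combination after merges.
CREATE TABLE visits_by_page_example (
    visit_date Date,
    page       String,
    visitors   AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY (visit_date, page);

-- Per-page numbers read one state per key.
SELECT visit_date, page, uniqMerge(visitors) AS unique_visitors
FROM visits_by_page_example
GROUP BY visit_date, page;

-- Coarser roll-ups still work: states for different pages are merged at query time.
SELECT visit_date, uniqMerge(visitors) AS unique_visitors
FROM visits_by_page_example
GROUP BY visit_date;
```

The finer the key, the more rows survive merging, so it is worth including only the dimensions that queries actually filter or group by.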
The key lever you control is the choice of aggregate functions. You must use the State variants for insertion and the corresponding Merge functions for querying. The performance gain comes from the fact that merging states is usually much cheaper than re-aggregating raw data. For example, merging two uniqState objects touches only two compact summaries, whereas recomputing the count from raw data means scanning every user ID in both sets.
When you define an AggregatingMergeTree table, the columns intended for aggregation don’t store raw data but rather the intermediate aggregate states. This means that a query like SELECT uniq(user_id) FROM my_agg_table will not return what you intend if user_id is supposed to be aggregated, because the raw user_id values are no longer there. Instead, you must explicitly use the State function during insertion (or have it applied implicitly by how you structure your data ingestion, for example through a materialized view) and then use the Merge function during querying.
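One common way to have the State function applied implicitly is a materialized view that feeds the AggregatingMergeTree table from a raw table. The sketch below assumes this pattern; all three object names are hypothetical.

```sql
-- Raw events land here with ordinary column types.
CREATE TABLE raw_visits_example (
    visit_date Date,
    user_id    UInt64
) ENGINE = MergeTree()
ORDER BY (visit_date, user_id);

-- Aggregated target table holding one uniq state per day.
CREATE TABLE daily_visitors_example (
    visit_date Date,
    visitors   AggregateFunction(uniq, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY visit_date;

-- The view runs on every inserted block and writes states into the target table.
CREATE MATERIALIZED VIEW daily_visitors_mv_example TO daily_visitors_example AS
SELECT
    visit_date,
    uniqState(user_id) AS visitors
FROM raw_visits_example
GROUP BY visit_date;

-- Writers insert plain values; no State function is needed at the call site.
INSERT INTO raw_visits_example VALUES
    ('2023-10-26', 101),
    ('2023-10-26', 102),
    ('2023-10-26', 101);

-- Readers merge the states.
SELECT visit_date, uniqMerge(visitors) AS unique_visitors
FROM daily_visitors_example
GROUP BY visit_date;
```

With this layout the raw table remains available for ad hoc queries, while dashboards and repeated reports read the much smaller aggregated table.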
The next concept you’ll run into is how to handle multiple aggregate functions on the same set of data within an AggregatingMergeTree table, which often involves using Nested data structures or multiple aggregate columns.