ClickHouse doesn’t lock tables for ALTER TABLE or mutations, even though it’s a common assumption for databases.

Let’s see it in action. Imagine you have a table events and you want to add a new column user_id.

-- Create a sample table
CREATE TABLE events (
    timestamp DateTime,
    event_type String,
    data String
) ENGINE = MergeTree()
ORDER BY timestamp;

-- Insert some data
INSERT INTO events VALUES (now(), 'click', '{"button": "login"}'), (now() - INTERVAL 1 MINUTE, 'view', '{"page": "/home"}');

Now, while data is being inserted and queried, let’s run an ALTER TABLE statement.

-- Add a new column
ALTER TABLE events ADD COLUMN user_id UInt64;

This ALTER statement doesn’t block any ongoing INSERT operations or SELECT queries. You can run SELECT statements against events right after issuing the ALTER, and they will continue to work, potentially seeing data with or without the new user_id column depending on when the data was written.

The same applies to mutations, which are essentially background ALTER operations that modify data in place. Let’s say you want to add user_id to existing rows where event_type is 'click'.

-- Apply a mutation to add user_id to specific rows
ALTER TABLE events UPDATE user_id = randCanonical(1, 1000) WHERE event_type = 'click' SETTINGS mutations_sync = 1;

Even with mutations_sync = 1 (which waits for the mutation to complete on some parts, not all), the table remains accessible. New data written after the ALTER statement will automatically have the user_id column. Data written before will be updated by the mutation in the background.

The Mental Model: Immutable Parts and Background Processes

ClickHouse’s MergeTree family of engines fundamentally works with immutable data parts. When you INSERT data, ClickHouse writes new data parts to disk. When you ALTER TABLE (like adding a column), it doesn’t modify existing parts. Instead, it creates new parts that include the new schema. For operations like UPDATE or DELETE (mutations), ClickHouse creates new versions of the affected data parts in the background, eventually replacing the old ones.

This means that SELECT queries might read from a mix of old and new parts. The query planner is smart enough to understand the schema evolution. For a new column, older parts simply won’t have it, and SELECT will return default values (like 0 for UInt64) for those rows. For mutations, the query planner ensures consistency by reading from the latest available version of a part.

The ALTER TABLE ADD COLUMN command is almost instantaneous because it’s just a metadata change. It tells ClickHouse that future operations and queries should consider a new column. For existing data, the column will appear with default values.

Mutations (ALTER TABLE ... UPDATE/DELETE) are more complex. They operate in the background. ClickHouse identifies data parts that need modification, creates new versions of those parts with the changes applied, and then atomically replaces the old parts with the new ones. This process is managed by background threads and doesn’t block foreground operations. The SETTINGS mutations_sync = 1 option makes the ALTER command wait until at least one data part has been processed by the mutation, giving you a stronger signal that the change is in progress, but it still doesn’t lock the table.

Levers You Control

  • ALTER TABLE ADD COLUMN: This is your go-to for adding new fields. It’s fast and non-blocking.
  • ALTER TABLE MODIFY COLUMN: Changing a column’s type or default value. This also involves creating new data parts and is non-blocking.
  • ALTER TABLE ... UPDATE/DELETE: These are mutations. They are asynchronous and non-blocking by default.
  • SETTINGS mutations_sync = 1: When used with UPDATE or DELETE, this setting makes the ALTER command wait until the mutation has been applied to at least one data part. This is useful for getting an early indication that the mutation is working, but it’s not a lock.
  • ALTER TABLE ... CLEAR COLUMN IN PARTITION/RANGE: A specific type of mutation to clear data.
  • ALTER TABLE ... MATERIALIZE COLUMN: Used to compute and store default values for existing rows, making them physically present in data parts. This is also a background, non-blocking operation.

The most surprising thing about ClickHouse’s ALTER and mutation system is how it achieves high throughput and availability by leveraging immutable data parts and background processing, allowing schema changes and data modifications to happen concurrently with regular read and write traffic. This is a stark contrast to many traditional relational databases where such operations often require exclusive table locks, leading to significant downtime.

The next thing you’ll likely need to manage is the impact of these background mutations on disk space and query performance.

Want structured learning?

Take the full Clickhouse course →