BigQuery tables can be cloned and snapshotted for zero-cost backups, but the "zero-cost" part is a bit of a marketing spin; you’re not paying for storage, but you are paying for the metadata operations and the querying to access those backups.

Let’s see it in action. Imagine you have a table called my_dataset.my_table containing critical sales data.

-- Create a table for demonstration
CREATE OR REPLACE TABLE my_dataset.my_table (
    sale_id INT64,
    product STRING,
    amount NUMERIC,
    sale_timestamp TIMESTAMP
);

-- Insert some data
INSERT INTO my_dataset.my_table VALUES
(1, 'Widget A', 10.50, CURRENT_TIMESTAMP()),
(2, 'Gadget B', 25.00, CURRENT_TIMESTAMP());

-- Now, let's create a "snapshot" using a time travel query
SELECT * FROM `my_project.my_dataset.my_table` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

This query, without any WHERE clause, would retrieve all data from one hour ago. BigQuery’s time travel feature allows you to query data as it existed at any point within the last 7 days (or your configured data retention period). This is the foundation of your "zero-cost" backup.

But what if you need to restore this data or keep it for longer than 7 days? That’s where cloning comes in.

-- Create a clone of the table as it exists *right now*
CREATE OR REPLACE TABLE my_dataset.my_table_clone AS
SELECT * FROM my_dataset.my_table;

-- Or, create a clone of the table as it existed an hour ago
CREATE OR REPLACE TABLE my_dataset.my_table_snapshot_clone
CLONE my_dataset.my_table
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);

When you CLONE a table, BigQuery creates a new table that points to the same underlying data as the original table at the specified point in time. It doesn’t copy the data itself. This is why it’s "zero-cost" in terms of storage for the data itself. The new table has its own metadata, schema, and access controls, but it shares the physical data blocks with the source.

The Mental Model:

Think of BigQuery’s storage as a series of immutable, versioned data blocks. When you update or delete data, you’re not modifying existing blocks; you’re creating new ones and updating pointers. Time travel works by keeping older versions of these pointers and the associated data blocks accessible for a configurable duration (defaulting to 7 days).

Cloning takes this a step further. When you clone a table A to create table B at time T, table B essentially gets a set of pointers to the data blocks that constituted table A at time T. If table A is later modified or deleted, table B remains unaffected because its pointers are independent. This makes cloning an excellent mechanism for creating point-in-time backups that persist independently of the original table’s lifecycle.

You can also clone tables across different datasets or even projects, which is useful for disaster recovery or creating isolated copies for testing.

-- Clone to a different dataset in the same project
CREATE OR REPLACE TABLE my_dataset_backup.my_table_backup_20231027
CLONE my_project.my_dataset.my_table;

-- Clone to a different project (requires appropriate IAM permissions)
CREATE OR REPLACE TABLE `another-project.backup_dataset.my_table_backup`
CLONE `my-project.my_dataset.my_table`;

The "zero-cost" aspect applies primarily to the data storage. You aren’t duplicating petabytes of data. However, there are still costs associated with BigQuery:

  1. Metadata Operations: Each CLONE operation incurs a small metadata cost. This is usually negligible unless you’re cloning thousands of tables hourly.
  2. Querying Cloned Tables: When you query a cloned table, you are charged for the data scanned, just like any other BigQuery table. If your backup is 1TB and you query 500GB of it, you pay for that 500GB scan.
  3. Storage for Clones (after time travel): While a clone initially points to shared data, if the original table’s data blocks are garbage collected (after the time travel window expires and the data is no longer referenced by any active table or clone), BigQuery might need to materialize the data for the clone. This is less common than you might think, as clones themselves act as references that keep data alive. However, if you clone a table and then drop the original, the cloned data blocks are now solely owned by the clone and will incur standard storage costs. A true "zero-cost" backup strategy relies on the fact that the clone itself is keeping the data alive, and you’re not paying extra for storage beyond what’s needed to satisfy the clone’s existence.

The most surprising true thing about this is that BigQuery’s "zero-cost" backup strategy using cloning is fundamentally about managing data block references rather than copying data. A cloned table doesn’t consume additional storage for the data itself until the original data blocks are no longer referenced by any active table or clone, at which point the clone’s references ensure those blocks are retained, and then you start paying standard storage rates for the data that the clone exclusively relies on.

The next logical step after establishing a cloning strategy is to automate it using scheduled queries or Cloud Functions.

Want structured learning?

Take the full Bigquery course →