BigQuery’s Time Travel feature lets you query data as it existed at a specific point in the past, up to seven days ago.
Imagine you’ve just run a bad UPDATE statement that wiped out half your customer records. Or perhaps you need to audit changes made to a critical table over the last few days. Instead of restoring from a backup, which can mean hours of downtime and losing everything written since the backup was taken, you can use Time Travel.
Let’s say you have a table named my_dataset.customer_data. You want to see what the data looked like one hour ago.
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
LIMIT 10;
This query returns rows from customer_data as they existed one hour ago. The FOR SYSTEM_TIME AS OF clause is the key: it takes a timestamp expression, which can be a specific timestamp or one computed relative to now, such as TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR).
Here’s another example, counting the rows the table held exactly 24 hours ago:
SELECT COUNT(*)
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR);
You can also specify a timestamp directly, as long as it falls within the time travel window:
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF '2023-10-27 10:00:00 UTC'
LIMIT 5;
By default the data is available for seven days. Once a version ages out of that window, it can no longer be queried with Time Travel.
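Seven days is both the default and the maximum. As a hedged sketch, the window can be shortened per dataset with the max_time_travel_hours option. The value is in hours, must be a multiple of 24, and ranges from 48 (two days) to 168 (seven days). The dataset name my_dataset is carried over from the examples above.

```sql
-- Sketch: shorten the time travel window for one dataset to two days.
-- Accepted values are multiples of 24 between 48 and 168 (the default).
ALTER SCHEMA `my_dataset`
SET OPTIONS (max_time_travel_hours = 48);
```

A shorter window reduces how far back FOR SYSTEM_TIME AS OF can reach for every table in that dataset, so it is worth checking against your recovery needs before changing it.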
BigQuery retains historical data for all tables by default, so no special configuration is needed to enable Time Travel. Under the hood, BigQuery’s storage layer keeps earlier versions of data blocks for the duration of the window. When you query with FOR SYSTEM_TIME AS OF, BigQuery reads the versions that were current at that moment to reconstruct the table’s state. This is all managed transparently; you don’t interact with these blocks directly.
The exact timestamp you provide is crucial. BigQuery interprets a timestamp without an explicit time zone as UTC. If your application logs events in America/New_York and you want the table state at 10 AM New York time on a given day, that wall-clock time has to be resolved to the correct instant. Converting by hand is error-prone, because New York is at UTC-5 in winter but UTC-4 during daylight saving time, so 10 AM on October 26 is 14:00 UTC, not 15:00. It is safer to pass the zone name and let BigQuery do the conversion.
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP('2023-10-26 10:00:00', 'America/New_York') -- 10 AM New York time; BigQuery resolves the UTC offset
LIMIT 5;
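Before using a converted timestamp in a Time Travel query, it can help to check which instant it actually resolves to. The sketch below uses two illustrative dates (assumptions, not taken from any real incident) to show how the same wall-clock time in America/New_York maps to different UTC instants depending on daylight saving time.

```sql
-- Sketch: the same local wall-clock time resolves to different UTC instants
-- depending on whether daylight saving time is in effect on that date.
SELECT
  TIMESTAMP('2023-10-26 10:00:00', 'America/New_York') AS during_daylight_time,  -- 2023-10-26 14:00:00 UTC
  TIMESTAMP('2023-12-26 10:00:00', 'America/New_York') AS during_standard_time;  -- 2023-12-26 15:00:00 UTC
```

Passing the zone name to TIMESTAMP() avoids hard-coding an offset that is only correct for part of the year.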
The cost of querying historical data is the same as querying current data: you pay for the bytes scanned. Storage is a separate matter. BigQuery maintains the historical versions for the length of the window, and whether they appear on your bill depends on the dataset’s storage billing model. Under logical storage billing they are not charged separately, while under physical storage billing the bytes retained for time travel are billed. The less obvious "cost" is the risk of misunderstanding exactly which historical point you’re querying, leading to incorrect analysis or recovery.
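One thing that trips people up is expecting a range form such as FOR SYSTEM_TIME BETWEEN, which some other SQL dialects offer for temporal tables. BigQuery supports only FOR SYSTEM_TIME AS OF, so every Time Travel read is a snapshot at a single instant. A single query statement also cannot reference the same table at more than one point in time, so to see what changed across a window you materialize one snapshot first and then compare it with another. For example, to find rows that were present 25 hours ago but were gone or different 24 hours ago:
-- Statement 1: save the snapshot from the start of the window (the table name is a placeholder).
CREATE TABLE `my_dataset.customer_data_window_start` AS
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 25 HOUR);
-- Statement 2: rows in the saved snapshot with no identical row at the end of the window.
SELECT * FROM `my_dataset.customer_data_window_start`
EXCEPT DISTINCT
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
LIMIT 10;
Comparing snapshots this way is useful for change tracking. As written, the query surfaces rows that were deleted or updated during the window; swapping the two sides of the EXCEPT DISTINCT surfaces rows that were inserted or updated instead.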
You’ll next want to explore how to use this for automated data recovery scripts or to build audit trails.
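As a starting point for the recovery case, here is a minimal sketch that copies a table’s earlier state into a new table so it can be inspected before anything is done to the live one. The target name customer_data_recovered and the two-hour offset are assumptions for illustration; in practice you would pick a timestamp just before the bad change.

```sql
-- Sketch: materialize the pre-incident state into a separate table.
-- `customer_data_recovered` is a hypothetical name; adjust the interval
-- (or use a fixed timestamp) so it points to just before the bad change.
CREATE TABLE `my_dataset.customer_data_recovered` AS
SELECT *
FROM `my_dataset.customer_data`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 HOUR);
```

Writing to a separate table keeps the damaged table intact for comparison. Note that a plain CREATE TABLE ... AS SELECT does not carry over settings such as partitioning or clustering, so those would need to be declared explicitly if the recovered table is meant to replace the original.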