BigQuery column policies (policy tags attached to columns, plus data policies that define the masking rule) are one of the most straightforward ways to enforce data masking without touching your application code.

Let’s see this in action. Imagine you have a table my_dataset.customer_info with a credit_card_number column that you need to protect.

-- Original data (what you might see without policies)
SELECT * FROM `my_project.my_dataset.customer_info` LIMIT 10;

-- Sample Output (before masking)
-- | customer_id | name      | credit_card_number |
-- |-------------|-----------|--------------------|
-- | 123         | Alice     | 1234-5678-9012-3456|
-- | 456         | Bob       | 9876-5432-1098-7654|

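Now, let’s apply a masking policy. We’ll use the nullify rule (ALWAYS_NULL) for simplicity here, but BigQuery offers other predefined rules such as LAST_FOUR_CHARACTERS, SHA256, and DEFAULT_MASKING_VALUE, as well as custom masking routines written as UDFs.

First, grant the necessary permissions to the user or service account that will manage the policies. This is typically the Policy Tag Admin role (roles/datacatalog.categoryAdmin) for taxonomies and policy tags, plus the BigQuery Data Policy Admin role (roles/bigquerydatapolicy.admin) for the masking rules attached to them.

Then, create the policy. This takes three steps: create a taxonomy with a policy tag for the sensitive data, create a data policy on that tag with the ALWAYS_NULL rule, and attach the tag to the column by updating the table schema. The last step can be done with the bq tool.

# Export the schema, add the policy tag to credit_card_number in schema.json, then apply it
bq show --schema --format=prettyjson my_project:my_dataset.customer_info > schema.json
bq update my_project:my_dataset.customer_info schema.json

In schema.json, the credit_card_number field gains a policyTags entry whose names list holds the full resource name of the policy tag.

After applying this policy, what a caller sees depends on their roles on the policy tag and its data policy. Principals with the Masked Reader role (roles/bigquerydatapolicy.maskedReader) on the data policy see the masked data. Principals with the Fine-Grained Reader role (roles/datacatalog.categoryFineGrainedReader) on the policy tag see the original values. Principals with neither role get an access-denied error when they query the column. Table-level roles such as bigquery.dataViewer or bigquery.dataOwner do not bypass the policy on their own.

-- Query after applying the policy, from a user with the Masked Reader role
SELECT * FROM `my_project.my_dataset.customer_info` LIMIT 10;

-- Sample Output (after masking with ALWAYS_NULL)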
-- | customer_id | name      | credit_card_number |
-- |-------------|-----------|--------------------|
-- | 123         | Alice     | NULL               |
-- | 456         | Bob       | NULL               |

The core problem column policies solve is the need for fine-grained access control on specific data within a table, without resorting to complex ETL processes or creating multiple views for different user groups. It decouples data protection from data consumption.

Internally, BigQuery enforces this at query time. When a query references a column that carries a policy tag, BigQuery checks the caller’s IAM roles on that tag and on its data policies. A caller with the Fine-Grained Reader role on the tag gets the raw values. A caller with the Masked Reader role on a data policy gets the output of that policy’s masking rule. A caller with neither is denied access to the column. The stored data is never rewritten; the masking rule is applied dynamically on each read.
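The query-time decision can be pictured with a toy query. This is only a sketch of the behavior, not how BigQuery implements it: caller_role is a hypothetical stand-in for the IAM check, its two values assume the policy-tag model in which a fine-grained reader sees raw values and a masked reader sees the masked ones, and the sample rows are inlined so the query runs without any policy in place.

```sql
-- Toy model only: caller_role is a hypothetical stand-in for BigQuery's IAM check.
WITH customer_info AS (
  SELECT 123 AS customer_id, 'Alice' AS name, '1234-5678-9012-3456' AS credit_card_number
  UNION ALL
  SELECT 456, 'Bob', '9876-5432-1098-7654'
),
callers AS (
  SELECT 'fine_grained_reader' AS caller_role
  UNION ALL
  SELECT 'masked_reader'
)
SELECT
  caller_role,
  customer_id,
  name,
  CASE caller_role
    WHEN 'fine_grained_reader' THEN credit_card_number  -- raw value passes through
    WHEN 'masked_reader' THEN CAST(NULL AS STRING)      -- nullify rule applied
  END AS credit_card_number
FROM customer_info
CROSS JOIN callers
ORDER BY caller_role, customer_id;
```

A caller with neither role has no row in this model, because in that case the query fails with an access error rather than returning a value.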

The key levers you control are:

  • Column Selection: Which columns carry a policy tag.
  • Masking Function: The rule attached to the tag through a data policy (e.g., ALWAYS_NULL, LAST_FOUR_CHARACTERS, SHA256, or a custom masking routine).
  • IAM Roles: Who sees what. Fine-Grained Reader on the policy tag allows reading unmasked data, and Masked Reader on the data policy allows reading masked data. The bigquery.dataViewer, bigquery.dataEditor, and bigquery.dataOwner roles control access to the table itself but do not bypass masking.
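To get a feel for the masking-function lever, the query below approximates the output of three kinds of rule (nullify, last four characters, and hash) using ordinary GoogleSQL functions on inlined sample rows. It is an emulation rather than the built-in rules themselves: the column aliases are hypothetical, and the 'XXXXX' padding and hex encoding are assumptions of this sketch, so the built-in rules may format their output differently.

```sql
-- Emulation with plain functions; no policy tags or data policies are involved.
WITH customer_info AS (
  SELECT 123 AS customer_id, '1234-5678-9012-3456' AS credit_card_number
  UNION ALL
  SELECT 456, '9876-5432-1098-7654'
)
SELECT
  customer_id,
  CAST(NULL AS STRING) AS nullified,                               -- nullify
  CONCAT('XXXXX', SUBSTR(credit_card_number, -4)) AS last_four,    -- keep last four characters
  TO_HEX(SHA256(credit_card_number)) AS sha256_hex                 -- one-way hash, hex-encoded
FROM customer_info;
```

Running this side by side makes the trade-off visible: nullify removes all utility, last-four keeps a human-recognizable suffix, and hashing keeps equality comparisons while hiding the digits.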

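A common misconception is that policies are applied at the table level. They are explicitly column policies, meaning you can have different masking rules for different columns in the same table, or no rules at all for other columns. A column carries only one policy tag, but that tag can have several data policies, each with its own masking rule and its own set of masked readers. When a caller qualifies for more than one of them, BigQuery picks the rule according to a documented priority order among masking rules, which is not simply the most restrictive rule, so check that order before layering policies.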

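The real power comes when you chain transformations inside a custom masking routine. For instance, you could strip the separators from a credit card number and then hash the result, so masked readers see a stable hashed token instead of the digits.

CREATE OR REPLACE FUNCTION my_dataset.mask_card(card STRING) RETURNS STRING
  OPTIONS (data_governance_type = "DATA_MASKING")
  AS (TO_HEX(SHA256(REGEXP_REPLACE(card, r'[^0-9]', '')))); -- Example of chaining

A data policy on the column’s policy tag then references this routine as its masking rule. Masking routines accept only a restricted set of built-in functions, so check that list before adding others. Because equal inputs produce equal hashes, masked readers can still count, group, and join on the column without seeing the sensitive details.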

The next step you’ll likely encounter is managing exceptions for specific users or groups who do need to see the sensitive data, which involves careful IAM role management.
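Exceptions usually involve two layers: access to the table and access to the protected column. As a minimal sketch of the first layer, table-level read access can be granted in SQL. The principal analyst@example.com is hypothetical. The column-level roles (Fine-Grained Reader on the policy tag, Masked Reader on the data policy) are granted on those resources themselves, for example in the console, rather than with this statement.

```sql
-- Table-level read access only; this does not unmask policy-tagged columns.
-- analyst@example.com is a hypothetical principal.
GRANT `roles/bigquery.dataViewer`
ON TABLE `my_project.my_dataset.customer_info`
TO "user:analyst@example.com";
```

Keeping the two layers separate means a broad table grant never exposes raw card numbers by accident; only principals explicitly added to the policy tag can read them.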
