DynamoDB Filter Expressions and Key Conditions look similar, but they operate at fundamentally different stages of a query, leading to drastically different performance and cost implications.

Let’s see this in action. Imagine you have a table MyTable with partition_key (string) and sort_key (number) as the primary key, and a global secondary index (GSI) MyGSI with gsi_partition_key (string) and gsi_sort_key (number).

Here’s how you might query it:

Scenario 1: Using Key Condition for efficient retrieval (Good!)

aws dynamodb query \
    --table-name MyTable \
    --index-name MyGSI \
    --key-condition-expression "gsi_partition_key = :pkval AND gsi_sort_key BETWEEN :start_sk AND :end_sk" \
    --expression-attribute-values '{
        ":pkval": {"S": "user#123"},
        ":start_sk": {"N": "100"},
        ":end_sk": {"N": "200"}
    }'

This query targets MyGSI. The key-condition-expression uses the GSI’s partition and sort keys. DynamoDB efficiently locates the exact items matching gsi_partition_key = 'user#123' and gsi_sort_key between 100 and 200. It only reads the data for those specific items.

Scenario 2: Using Filter Expression for inefficient filtering (Bad!)

aws dynamodb query \
    --table-name MyTable \
    --index-name MyGSI \
    --key-condition-expression "gsi_partition_key = :pkval" \
    --filter-expression "item_value BETWEEN :start_val AND :end_val" \
    --expression-attribute-values '{
        ":pkval": {"S": "user#123"},
        ":start_val": {"N": "100"},
        ":end_val": {"N": "200"}
    }'

This query also targets MyGSI with the same partition key. However, the range is now a filter-expression on item_value, a non-key numeric attribute assumed here for illustration. (DynamoDB rejects a Query whose filter-expression references the partition or sort key of the index being queried, so a range on gsi_sort_key can only go in the key-condition-expression.) DynamoDB first reads every item where gsi_partition_key = 'user#123' (potentially millions of items, returned in pages of up to 1 MB). Then it discards the items whose item_value falls outside 100 to 200. You pay for reading all the intermediate items, even though you discard most of them.
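To make the difference concrete, here is a toy in-memory model of a single partition. It is only a sketch and not how DynamoDB is implemented: the sorted list stands in for the index, and the `value` field is a hypothetical non-key attribute that happens to mirror the sort key so both approaches select the same items.

```python
from bisect import bisect_left, bisect_right

# Toy model of one index partition: items are kept sorted by sort key.
# Each item is (sort_key, value); "value" is a hypothetical non-key attribute.
partition = [(sk, sk) for sk in range(1, 1001)]
sort_keys = [sk for sk, _ in partition]

def query_with_key_condition(lo, hi):
    """Range on the sort key: seek into the sorted data and read only matches."""
    start = bisect_left(sort_keys, lo)
    end = bisect_right(sort_keys, hi)  # BETWEEN is inclusive on both ends
    matched = partition[start:end]
    return matched, len(matched)  # (items returned, items read)

def query_with_filter(lo, hi):
    """Range as a filter: read the whole partition, then discard non-matches."""
    items_read = list(partition)  # everything under the partition key
    matched = [item for item in items_read if lo <= item[1] <= hi]
    return matched, len(items_read)

kc_items, kc_read = query_with_key_condition(100, 200)
f_items, f_read = query_with_filter(100, 200)
print(f"key condition: returned {len(kc_items)}, read {kc_read}")
print(f"filter:        returned {len(f_items)}, read {f_read}")
```

Both functions return the same 101 items, but the filter version reads all 1,000 items in the partition to find them.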

The underlying problem is how to retrieve specific data from DynamoDB efficiently. DynamoDB is a NoSQL key-value and document store, and its performance depends heavily on how you access data through its primary key (partition key and sort key) or through secondary indexes.

Within a Query, Key Conditions are the only way to target specific items before data is read from storage. When you use a key-condition-expression, DynamoDB uses the partition key to find the right partition and the sort key ordering to seek directly to the matching range of items. This is extremely fast and cost-effective.

Filter Expressions, on the other hand, are applied after items are read from storage. DynamoDB reads a superset of items based on your key-condition-expression (or a scan operation), and then it discards any items that don’t match the filter-expression. This means you pay for the read capacity units (RCUs) consumed by all items read, even those you ultimately discard.

The mental model here is about the "cost boundary." Key Conditions operate before the read cost is incurred for irrelevant data. Filter Expressions operate after the read cost has already been paid for all items retrieved by the key condition.
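The cost boundary can be put into rough numbers. The sketch below is a simplified model, not AWS's billing code: it assumes a Query sums the sizes of the items it reads and rounds up to the next 4 KB (taken here as 4,096 bytes), with one RCU per 4 KB for a strongly consistent read and half that for an eventually consistent one. It ignores pagination, and the item counts and 400-byte item size are made-up figures.

```python
import math

def query_rcus(items_read, avg_item_bytes, strongly_consistent=False):
    """Approximate RCUs for one Query, based on the items READ, not returned."""
    total_bytes = items_read * avg_item_bytes
    units = math.ceil(total_bytes / 4096)  # round the total up to 4 KB units
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return units if strongly_consistent else units / 2

# Illustrative figures: a partition of 2,000 items, 101 of which match the range.
key_condition_cost = query_rcus(items_read=101, avg_item_bytes=400)
filter_cost = query_rcus(items_read=2000, avg_item_bytes=400)

print(key_condition_cost)  # 5.0
print(filter_cost)         # 98.0
```

Under these assumptions the filter-based query costs roughly twenty times more for the same 101 returned items, and the gap grows with the size of the partition.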

Here are the exact levers you control:

  • key-condition-expression: Required for a Query. It must contain an equality check on the partition key and may add one condition on the sort key (=, <, <=, >, >=, BETWEEN, or begins_with), provided the sort key is part of the table or index key schema. This is your primary tool for efficient data retrieval.
  • filter-expression: Use this only for attributes that are not part of the table or index key schema, or when you absolutely cannot structure your keys to support the desired filtering. It’s a secondary refinement mechanism.
  • projection-expression: This controls which attributes are returned for the items that do match your query. It doesn’t affect which items are read (that’s the Key Condition) or which are returned (that’s the Filter Expression). It reduces the amount of data transferred over the network, but not the read capacity consumed, because DynamoDB computes RCUs from the size of the items read rather than the size of the data returned.
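As a sketch of how the three levers combine in one request, the function below builds the parameter dictionary that boto3's `client.query` accepts. It does not call AWS. The attribute `item_value` is hypothetical, is assumed to be projected into MyGSI, and is referenced through the placeholder `#v` to sidestep any reserved-word clash.

```python
def build_query_params(pk, start_sk, end_sk, min_value):
    """Build Query parameters; item_value is a hypothetical non-key attribute."""
    return {
        "TableName": "MyTable",
        "IndexName": "MyGSI",
        # Lever 1: decides which items are read (and billed).
        "KeyConditionExpression":
            "gsi_partition_key = :pk AND gsi_sort_key BETWEEN :lo AND :hi",
        # Lever 2: drops items after they have been read; non-key attributes only.
        "FilterExpression": "#v >= :min_value",
        # Lever 3: trims the attributes returned for each surviving item.
        "ProjectionExpression": "gsi_sort_key, #v",
        "ExpressionAttributeNames": {"#v": "item_value"},
        "ExpressionAttributeValues": {
            ":pk": {"S": pk},
            ":lo": {"N": str(start_sk)},  # numbers are sent as strings
            ":hi": {"N": str(end_sk)},
            ":min_value": {"N": str(min_value)},
        },
    }

params = build_query_params("user#123", 100, 200, 50)
print(params["KeyConditionExpression"])
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("dynamodb").query(**params)`.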

Many people think of Filter Expressions as just another way to narrow down results. The critical, often-missed mechanical detail is that Filter Expressions do not reduce the number of items DynamoDB has to read from disk to satisfy a query or scan. They only reduce the number of items that are returned to the client after they’ve already been read. This is why filtering a large partition on a non-key attribute can be orders of magnitude more expensive than expressing the same range as a Key Condition on a sort key.
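You can observe this gap directly: every Query and Scan response carries a ScannedCount (items read before filtering) and a Count (items left after filtering). The helper below is a small sketch that turns those two fields into a ratio. The sample response is a hand-written stand-in with illustrative numbers, not real output.

```python
def filter_efficiency(response):
    """Fraction of items read that survived the filter (1.0 means no waste)."""
    scanned = response["ScannedCount"]
    return response["Count"] / scanned if scanned else 1.0

# Hand-written stand-in for one page of a Query response (illustrative numbers).
sample_response = {"Count": 101, "ScannedCount": 10000}

ratio = filter_efficiency(sample_response)
print(f"{ratio:.2%} of the items read were returned")  # 1.01% of the items read were returned
```

Both counts are reported per page, so for a paginated query you would sum them across pages before dividing. A ratio far below 1.0 suggests the filtered attribute belongs in a key schema.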

The next concept you’ll run into is optimizing scan operations, where Filter Expressions are often the only way to narrow down results, but still carry the same read cost implications.
