DynamoDB doesn’t just give you all your data at once; it hands it to you in chunks, and LastEvaluatedKey is how you ask for the next chunk.
Let’s see this in action. Imagine we have a Products table with category (partition key) and id (sort key). We want to get products for the "electronics" category, but there might be hundreds of them.
```python
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Products')

# First request
response = table.query(
    KeyConditionExpression=Key('category').eq('electronics'),
    Limit=10  # We'll ask for a maximum of 10 items per page
)
items = response['Items']
print(f"Got {len(items)} items.")
# Process items...

# Check if there are more items
if 'LastEvaluatedKey' in response:
    last_key = response['LastEvaluatedKey']
    print(f"Last evaluated key: {last_key}")

    # Second request, using the LastEvaluatedKey
    response_next = table.query(
        KeyConditionExpression=Key('category').eq('electronics'),
        ExclusiveStartKey=last_key,  # This is the magic!
        Limit=10
    )
    items_next = response_next['Items']
    print(f"Got {len(items_next)} items in the next page.")
    # Process more items...

    # Check again for more pages
    if 'LastEvaluatedKey' in response_next:
        print(f"Next last evaluated key: {response_next['LastEvaluatedKey']}")
    else:
        print("No more items to retrieve.")
```
This ExclusiveStartKey is how DynamoDB knows exactly where to resume scanning or querying. It’s not just an offset; it’s a pointer to a specific item’s primary key. A simple numeric offset would skip or repeat items whenever data was added or removed between your requests, because the positions of the remaining items would shift.
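To make the offset problem concrete, here is a toy model in plain Python. It does not call DynamoDB; a sorted list stands in for the sort-key order within one partition, and the helper names `page_by_offset` and `page_after_key` are made up for this illustration.

```python
import bisect


def page_by_offset(sorted_ids, offset, limit):
    """Offset-style paging: skip `offset` positions, then take `limit` items."""
    return sorted_ids[offset:offset + limit]


def page_after_key(sorted_ids, start_key, limit):
    """Key-style paging: resume strictly after `start_key` (like ExclusiveStartKey)."""
    start = 0 if start_key is None else bisect.bisect_right(sorted_ids, start_key)
    return sorted_ids[start:start + limit]


ids = ["a", "c", "e", "g"]

first_offset = page_by_offset(ids, 0, 2)    # ['a', 'c']
first_key = page_after_key(ids, None, 2)    # ['a', 'c']

# Another writer inserts item "b" between our two requests.
bisect.insort(ids, "b")                     # ids is now ['a', 'b', 'c', 'e', 'g']

second_offset = page_by_offset(ids, 2, 2)             # ['c', 'e']: 'c' is returned twice
second_key = page_after_key(ids, first_key[-1], 2)    # ['e', 'g']: resumes cleanly after 'c'

print("offset paging:", first_offset + second_offset)
print("key paging:   ", first_key + second_key)
```

The offset version repeats "c" because the insert shifted every later position by one. The key version is unaffected because it resumes relative to an item, not a position.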
The problem LastEvaluatedKey solves is that a single Query or Scan call returns at most 1 MB of data. This is a cap on result size per request, not a throughput limit. You can’t pull an entire massive table in one go, so DynamoDB breaks results into pages. If a request stops before reaching the end of the matching data, either because it hit the 1 MB cap or because it processed Limit items, the response includes a LastEvaluatedKey. This key is the primary key of the last item DynamoDB evaluated for that page; when you query an index, it also carries the index’s key attributes.
To get the next page of results, you make another query or scan request, but this time you include the LastEvaluatedKey from the previous response in the ExclusiveStartKey parameter. DynamoDB then starts its scan or query after the item identified by ExclusiveStartKey. This process repeats until a response no longer contains LastEvaluatedKey, indicating you’ve retrieved all matching items.
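The loop described above can be sketched as a small generator. This is a minimal sketch, not the only way to do it: `iter_query_items` is a name invented here, and `table` is assumed to be a boto3 Table resource (or any object whose `query(**kwargs)` returns a dict with `Items` and an optional `LastEvaluatedKey`).

```python
from typing import Any, Dict, Iterator


def iter_query_items(table: Any, **query_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item matching a Query, following LastEvaluatedKey across pages."""
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not modified
    while True:
        response = table.query(**kwargs)
        yield from response.get("Items", [])

        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break  # no key in the response means this was the final page
        # Feed the key back in so the next request resumes after this page.
        kwargs["ExclusiveStartKey"] = last_key


# Usage (assumes the Products table from the example above):
#   from boto3.dynamodb.conditions import Key
#   for item in iter_query_items(
#       table,
#       KeyConditionExpression=Key('category').eq('electronics'),
#       Limit=10,
#   ):
#       ...
```

A generator keeps memory use flat for large result sets, since only one page is held at a time. Wrap the call in `list(...)` if you want everything at once.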
The Limit parameter caps how many items DynamoDB evaluates per page, but it doesn’t guarantee you’ll get that many. You might get fewer if the 1 MB data limit is reached first, or if a FilterExpression discards some of the evaluated items, since filters are applied after Limit. Conversely, if the total number of items matching your query is less than your Limit, you’ll just get all of them, and LastEvaluatedKey won’t be present.
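Because a page can come back short, code that needs a fixed number of items has to keep asking until it has enough. One possible sketch follows; `take_items` is a hypothetical helper, and `table` is again assumed to behave like a boto3 Table resource.

```python
def take_items(table, wanted, **query_kwargs):
    """Collect up to `wanted` items, issuing as many Query calls as needed.

    Returns (items, next_key). `next_key` is None once the data is exhausted;
    otherwise it can be passed back as ExclusiveStartKey to continue later.
    """
    if wanted < 1:
        raise ValueError("wanted must be at least 1")

    kwargs = dict(query_kwargs)
    collected = []
    next_key = None
    while len(collected) < wanted:
        # Ask only for what is still missing, so a page never overshoots and
        # the returned key always lines up with the last item we kept.
        kwargs["Limit"] = wanted - len(collected)
        response = table.query(**kwargs)
        collected.extend(response.get("Items", []))

        next_key = response.get("LastEvaluatedKey")
        if not next_key:
            break  # nothing left to read
        kwargs["ExclusiveStartKey"] = next_key
    return collected, next_key
```

Setting Limit to the remaining count on each call avoids having to truncate a page, which would leave LastEvaluatedKey pointing past the last item actually kept.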
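It’s crucial to understand that LastEvaluatedKey belongs to the query that generated it. It is only meaningful as an ExclusiveStartKey for the same table or index and the same key condition; pass it to a query over a different partition or index and DynamoDB will typically reject the request with a validation error. Changing Limit between pages is fine, since Limit only affects page size. Changing the FilterExpression partway through is accepted by DynamoDB, but the accumulated results then mix two different filters. If you change what you’re asking for, start over from the beginning with the new parameters.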
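If the key leaves your process, for example as a page token handed to an API client, it can help to record which query it came from and refuse it for any other. The sketch below is one way to do that with the standard library only. The names `encode_cursor` and `decode_cursor` and the shape of `query_params` (a plain JSON-serialisable dict describing the query) are assumptions made for this example. The token is not signed, so treat it as a consistency check, not a security measure.

```python
import base64
import hashlib
import json
from decimal import Decimal


def _fingerprint(query_params):
    """Short, stable hash of the parameters that define the query."""
    canonical = json.dumps(query_params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def _encode_decimal(value):
    # The boto3 resource layer returns numbers as Decimal, which json cannot dump.
    if isinstance(value, Decimal):
        return {"__decimal__": str(value)}
    raise TypeError(f"cannot serialise {type(value).__name__}")


def _decode_decimal(obj):
    if set(obj) == {"__decimal__"}:
        return Decimal(obj["__decimal__"])
    return obj


def encode_cursor(last_key, query_params):
    """Pack a LastEvaluatedKey and a query fingerprint into an opaque token."""
    payload = {"key": last_key, "fp": _fingerprint(query_params)}
    raw = json.dumps(payload, default=_encode_decimal).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token, query_params):
    """Return the key from `token`, or raise if it was issued for another query."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    payload = json.loads(raw, object_hook=_decode_decimal)
    if payload["fp"] != _fingerprint(query_params):
        raise ValueError("cursor was issued for a different query")
    return payload["key"]
```

This sketch only handles string and numeric key attributes; binary keys would need their own encoding.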
When your items are large, you’ll often see LastEvaluatedKey appear even if the number of items returned is well below your Limit. This is because the size of the data read, not just the item count, dictates where a page ends. A few very large items can exhaust the 1 MB limit faster than many small items.
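A back-of-the-envelope simulation shows the effect. This is a toy model in plain Python, not DynamoDB’s actual accounting: `simulate_page` is an invented helper that ends a page when either the item count reaches the limit or the next item would push the running total past 1 MB.

```python
ONE_MB = 1024 * 1024


def simulate_page(item_sizes, start, limit, max_bytes=ONE_MB):
    """Toy model of one page: returns (items_in_page, next_start_or_None)."""
    total = 0
    count = 0
    i = start
    while i < len(item_sizes) and count < limit:
        # Always admit at least one item, then stop before exceeding the cap.
        if count > 0 and total + item_sizes[i] > max_bytes:
            break
        total += item_sizes[i]
        count += 1
        i += 1
    next_start = i if i < len(item_sizes) else None
    return count, next_start


large = [300 * 1024] * 10   # ten items of 300 KB each
small = [1024] * 10         # ten items of 1 KB each

print(simulate_page(large, 0, 10))   # (3, 3): the size cap ends the page after 3 items
print(simulate_page(small, 0, 10))   # (10, None): the whole set fits in one page
```

With 300 KB items, a Limit of 10 never comes into play: three items already account for 900 KB, and a fourth would cross the cap.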
The next hurdle you’ll face is handling potential retries and ensuring your client-side logic correctly accumulates all pages of data, especially in the face of network interruptions or transient errors.
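As a starting point for that, here is a minimal sketch of accumulation with retries. The helper name `query_all_with_retry`, its parameters, and the backoff schedule are all assumptions for illustration. boto3 already retries some errors on its own, so a real version would likely pass a narrower exception type, such as `botocore.exceptions.ClientError`, as `retryable`.

```python
import time


def query_all_with_retry(table, *, retryable=(Exception,), max_attempts=5,
                         base_delay=0.1, sleep=time.sleep, **query_kwargs):
    """Fetch all pages of a Query, retrying each page with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    kwargs = dict(query_kwargs)
    items = []
    while True:
        for attempt in range(1, max_attempts + 1):
            try:
                response = table.query(**kwargs)
                break
            except retryable:
                if attempt == max_attempts:
                    raise  # give up and surface the last error
                sleep(base_delay * 2 ** (attempt - 1))

        # Items are appended only after a page succeeds, and the start key only
        # advances here, so a retry re-requests the same page without
        # introducing gaps or duplicates.
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies.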