DynamoDB’s Global Secondary Indexes (GSIs) are often designed around a single entity type, but you can pack multiple entity types into one GSI to reduce costs and consolidate your access patterns.
Let’s see this in action. Imagine we have two entity types: User and Order. A User has a userId and name. An Order has an orderId, userId, and status.
Here’s a simplified schema for User items:
| Attribute Name | Type | Description |
|---|---|---|
| pk | String | Partition key (e.g., USER#<userId>) |
| sk | String | Sort key (e.g., METADATA) |
| userId | String | User’s unique identifier |
| name | String | User’s name |
And a simplified schema for Order items, stored in the same table (a GSI indexes a single table):
| Attribute Name | Type | Description |
|---|---|---|
| pk | String | Partition key (e.g., USER#<userId>) |
| sk | String | Sort key (e.g., ORDER#<orderId>) |
| orderId | String | Order’s unique identifier |
| userId | String | User’s identifier |
| status | String | Order status (e.g., PENDING, SHIPPED) |
Notice that both User and Order share the same userId and can be logically grouped by it. This is our first clue.
Now, let’s create a GSI. We’ll call it UserOrdersGSI.
- GSI Name: UserOrdersGSI
- Partition Key: gsi_pk
- Sort Key: gsi_sk
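As a rough sketch, the index definition above could be written as the parameter dictionary that boto3’s `create_table` call accepts. The table name `YourTableName`, the `ALL` projection, and on-demand billing are assumptions chosen for illustration; the schema above does not prescribe them.

```python
# Hypothetical table name; the key attribute names follow the article.
TABLE_NAME = "YourTableName"

create_table_params = {
    "TableName": TABLE_NAME,
    # Only attributes used in a key schema are declared up front.
    "AttributeDefinitions": [
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "sk", "AttributeType": "S"},
        {"AttributeName": "gsi_pk", "AttributeType": "S"},
        {"AttributeName": "gsi_sk", "AttributeType": "S"},
    ],
    # Base table primary key.
    "KeySchema": [
        {"AttributeName": "pk", "KeyType": "HASH"},
        {"AttributeName": "sk", "KeyType": "RANGE"},
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "UserOrdersGSI",
            "KeySchema": [
                {"AttributeName": "gsi_pk", "KeyType": "HASH"},
                {"AttributeName": "gsi_sk", "KeyType": "RANGE"},
            ],
            # Assumption: project every attribute into the index.
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # Assumption: on-demand billing, so no ProvisionedThroughput blocks.
    "BillingMode": "PAY_PER_REQUEST",
}
```

With boto3 installed and credentials configured, this dictionary would be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.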
The magic happens in how we populate gsi_pk and gsi_sk for both entity types.
For a User item:
- pk: USER#123
- sk: METADATA
- userId: 123
- name: Alice
- gsi_pk: USER#123 (same as the table’s pk)
- gsi_sk: USER#METADATA (a prefix indicating it’s user metadata)
For an Order item:
- pk: USER#123
- sk: ORDER#ABC
- orderId: ABC
- userId: 123
- status: PENDING
- gsi_pk: USER#123 (same as the table’s pk)
- gsi_sk: ORDER#ABC (a prefix indicating it’s an order, followed by the order ID)
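The two item shapes above can be sketched as small builder functions. The function names `build_user_item` and `build_order_item` are hypothetical, and the items are plain Python dictionaries (the form a boto3 `Table` resource accepts in `put_item`), which is an assumption about how the application writes data.

```python
def build_user_item(user_id: str, name: str) -> dict:
    """Build a User item with both base-table and GSI keys."""
    return {
        "pk": f"USER#{user_id}",
        "sk": "METADATA",
        "userId": user_id,
        "name": name,
        "gsi_pk": f"USER#{user_id}",   # same value as the table's pk
        "gsi_sk": "USER#METADATA",     # type prefix marks user metadata
    }


def build_order_item(user_id: str, order_id: str, status: str) -> dict:
    """Build an Order item that lands in the same GSI partition as its user."""
    return {
        "pk": f"USER#{user_id}",
        "sk": f"ORDER#{order_id}",
        "orderId": order_id,
        "userId": user_id,
        "status": status,
        "gsi_pk": f"USER#{user_id}",   # groups the order with its user
        "gsi_sk": f"ORDER#{order_id}", # type prefix, then the order ID
    }


user = build_user_item("123", "Alice")
order = build_order_item("123", "ABC", "PENDING")
```

Keeping key construction in one place like this makes it harder for a stray prefix typo to put an item outside the range your queries expect.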
With this setup, our UserOrdersGSI will look like this:
| gsi_pk | gsi_sk | pk | sk | userId | name | orderId | status |
|---|---|---|---|---|---|---|---|
| USER#123 | USER#METADATA | USER#123 | METADATA | 123 | Alice | | |
| USER#123 | ORDER#ABC | USER#123 | ORDER#ABC | 123 | | ABC | PENDING |
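This allows us to fetch all items related to a specific user by querying UserOrdersGSI with gsi_pk = USER#123; the results come back ordered by gsi_sk. Adding a condition on gsi_sk to the key condition then narrows the result to one entity type or a single item.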
Querying for a user’s metadata:
{
  "TableName": "YourTableName",
  "IndexName": "UserOrdersGSI",
  "KeyConditionExpression": "gsi_pk = :pk AND gsi_sk = :sk",
  "ExpressionAttributeValues": {
    ":pk": {"S": "USER#123"},
    ":sk": {"S": "USER#METADATA"}
  }
}
Querying for all orders for a user:
{
  "TableName": "YourTableName",
  "IndexName": "UserOrdersGSI",
  "KeyConditionExpression": "gsi_pk = :pk AND begins_with(gsi_sk, :sk_prefix)",
  "ExpressionAttributeValues": {
    ":pk": {"S": "USER#123"},
    ":sk_prefix": {"S": "ORDER#"}
  }
}
Querying for a specific order for a user:
{
  "TableName": "YourTableName",
  "IndexName": "UserOrdersGSI",
  "KeyConditionExpression": "gsi_pk = :pk AND gsi_sk = :sk",
  "ExpressionAttributeValues": {
    ":pk": {"S": "USER#123"},
    ":sk": {"S": "ORDER#ABC"}
  }
}
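The three queries differ only in their sort key condition, so one way to produce them is a single helper that builds the request parameters. This is a minimal sketch: the function name `build_gsi_query` and its arguments are hypothetical, and it only constructs the dictionary without calling AWS.

```python
from typing import Optional


def build_gsi_query(
    user_id: str,
    sk_equals: Optional[str] = None,
    sk_prefix: Optional[str] = None,
    table_name: str = "YourTableName",  # placeholder name from the article
) -> dict:
    """Build low-level Query parameters for UserOrdersGSI."""
    if sk_equals is not None and sk_prefix is not None:
        raise ValueError("pass sk_equals or sk_prefix, not both")

    params = {
        "TableName": table_name,
        "IndexName": "UserOrdersGSI",
        "KeyConditionExpression": "gsi_pk = :pk",
        "ExpressionAttributeValues": {":pk": {"S": f"USER#{user_id}"}},
    }
    if sk_equals is not None:
        # Exact match: one specific item, e.g. USER#METADATA or ORDER#ABC.
        params["KeyConditionExpression"] += " AND gsi_sk = :sk"
        params["ExpressionAttributeValues"][":sk"] = {"S": sk_equals}
    elif sk_prefix is not None:
        # Prefix match: every item of one entity type, e.g. ORDER#.
        params["KeyConditionExpression"] += " AND begins_with(gsi_sk, :sk_prefix)"
        params["ExpressionAttributeValues"][":sk_prefix"] = {"S": sk_prefix}
    return params


metadata_query = build_gsi_query("123", sk_equals="USER#METADATA")
all_orders_query = build_gsi_query("123", sk_prefix="ORDER#")
one_order_query = build_gsi_query("123", sk_equals="ORDER#ABC")
```

Each result could then be sent with `boto3.client("dynamodb").query(**params)`, assuming boto3 is installed and configured. Calling the helper with only a user ID yields the "everything for this user" query.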
The core idea is to give the GSI a composite key whose sort key encodes the entity type. The gsi_pk often mirrors the base table’s partition key, which groups related entities under their primary identifier. The gsi_sk is where the discrimination happens: by using a prefix (like USER# or ORDER#) followed by the specific ID, you create a distinct, contiguous "slot" within the GSI’s sort key space for each entity type.
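To make the prefix discrimination concrete, here is a small sketch that sorts a mixed result set into buckets by the type prefix of gsi_sk. The helper names `entity_type` and `group_by_entity` are hypothetical, and the items are assumed to be plain dictionaries with string values.

```python
from collections import defaultdict


def entity_type(item: dict) -> str:
    """Return the type prefix of gsi_sk, i.e. the text before the first '#'."""
    return item["gsi_sk"].split("#", 1)[0]


def group_by_entity(items: list) -> dict:
    """Bucket a mixed list of GSI items by their entity type prefix."""
    groups = defaultdict(list)
    for item in items:
        groups[entity_type(item)].append(item)
    return dict(groups)


# Items shaped like the GSI rows shown earlier.
items = [
    {"gsi_pk": "USER#123", "gsi_sk": "ORDER#ABC", "status": "PENDING"},
    {"gsi_pk": "USER#123", "gsi_sk": "USER#METADATA", "name": "Alice"},
]
grouped = group_by_entity(items)
```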
This technique is a powerful way to consolidate access patterns. Instead of needing one GSI for user metadata and another for "all orders by user," you can serve both from a single index. This directly translates to fewer GSIs, which means lower throughput costs (each GSI consumes its own read and write capacity, and a write to an indexed item is also written to the index) and reduced data storage costs (each GSI stores its own copy of the projected attributes). The main trade-off is increased complexity in designing and querying your GSIs, as you need to be mindful of the prefixes and the types of queries you intend to run.
What most people miss is that you’re not limited to just two entity types per GSI. You can pack many more, as long as you can devise a clear, hierarchical, and queryable naming convention for your gsi_sk that distinguishes between them. For example, you might have sort key values like USER#<userId>, ORDER#<orderId>, and PRODUCT#<productId>, or even more granular ones like USER#<userId>#PROFILE versus USER#<userId>#PREFERENCES, all stored in the same gsi_pk and gsi_sk attributes.
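A hierarchical naming convention like this can be sketched as a tiny key builder plus a local stand-in for DynamoDB’s begins_with condition. Both `make_sort_key` and `select_by_prefix` are hypothetical helpers, and the rule that key parts may not contain the `#` separator is an assumption made to keep the hierarchy unambiguous.

```python
def make_sort_key(*parts: str) -> str:
    """Join key parts with '#', e.g. ('USER', '123', 'PROFILE')."""
    for part in parts:
        if "#" in part:
            # Assumption: '#' is reserved as the separator.
            raise ValueError(f"key part {part!r} must not contain '#'")
    return "#".join(parts)


def select_by_prefix(sort_keys: list, prefix: str) -> list:
    """Local equivalent of begins_with(gsi_sk, :prefix), for illustration."""
    return [key for key in sort_keys if key.startswith(prefix)]


sort_keys = [
    make_sort_key("USER", "123", "PROFILE"),
    make_sort_key("USER", "123", "PREFERENCES"),
    make_sort_key("ORDER", "ABC"),
    make_sort_key("PRODUCT", "XYZ"),
]

# Everything under the user: both the PROFILE and PREFERENCES items.
user_keys = select_by_prefix(sort_keys, "USER#123#")
# Only orders.
order_keys = select_by_prefix(sort_keys, "ORDER#")
```

Including the trailing `#` in the prefix matters: without it, a prefix of `USER#12` would also match a hypothetical `USER#1234#PROFILE` key.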
The next logical step is understanding how to handle query results that contain mixed entity types and how to efficiently deserialize them back into their respective application objects.