Designing DynamoDB GSIs Backwards from Query Requirements
The solution to "What question must DynamoDB answer in one Query call?"
One common mistake engineers make when working with DynamoDB is approaching Global Secondary Indexes the same way they would approach indexes in a relational database.
In SQL, indexes are often added reactively, once slow queries appear in production.
DynamoDB works in the opposite direction.
If you design GSIs without knowing exactly how they will be queried, you almost always end up with inefficient access patterns, unnecessary filters, and higher costs.
The correct mental model is to design GSIs backwards.
Instead of starting from your data model, you start from the exact questions your application needs to ask, and then shape the index keys to answer those questions directly.
In this article, I lay out a six-step approach to designing your indexes that yields scalable, performant, and cost-efficient queries.
Step 1: Start With the Query, Not the Data
Before thinking about attributes, entity types, or table structure, you should clearly articulate the queries your application must perform.
DynamoDB is optimized for known access patterns, not ad-hoc queries. That means every query should be intentional.
Take for example the following query:
“Get all posts for a user ordered by creation date”.
It implies two things:
what groups the data together
how the results should be ordered.
DynamoDB expresses those two dimensions through the partition key and sort key.
If you cannot clearly describe your access pattern using those terms, your design is not ready yet.
DynamoDB will not adapt later the way a relational database might.
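One way to make this step concrete is to write each access pattern down as data before touching the table design. The sketch below is only an illustration: the `accessPatterns` list and the `describeKeys` helper are hypothetical names I made up for this example, not part of any DynamoDB API.

```javascript
// Hypothetical inventory of access patterns, written before any schema exists.
const accessPatterns = [
  {
    question: "Get all posts for a user ordered by creation date",
    groupedBy: "user",      // what groups the data together -> partition key
    orderedBy: "createdAt", // how the results are ordered -> sort key
  },
  {
    question: "Get all users for an organization",
    groupedBy: "organization",
    orderedBy: null, // no ordering requirement has been stated for this one
  },
];

// Maps one access pattern onto the two dimensions DynamoDB understands.
// Throws if the pattern has no scope, because then there is nothing to partition on.
function describeKeys(pattern) {
  if (!pattern.groupedBy) {
    throw new Error(`No scope defined for: ${pattern.question}`);
  }
  return {
    partitionKey: pattern.groupedBy,
    sortKey: pattern.orderedBy,
  };
}

for (const pattern of accessPatterns) {
  console.log(pattern.question, "->", describeKeys(pattern));
}
```

If a pattern cannot be filled in this way, that is usually the signal that the question itself still needs work.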
Step 2: Derive the Partition Key from the Query Scope
Once the query is defined, the partition key should represent the natural boundary of that query.
If you are fetching posts for a user, the user is the scope (user partition).
If you are fetching users for an organization, the organization is the scope.
For example, if your requirement is to fetch all posts belonging to a single user, the partition key should encode the user identifier. The sort key can then be used to impose ordering or filtering within that user’s data.
GSI1PK = user#<userId>
GSI1SK = post#<createdAt>
With this structure, querying becomes straightforward and efficient.
const params = {
  TableName: "app",
  IndexName: "GSI1",
  KeyConditionExpression: "GSI1PK = :pk",
  ExpressionAttributeValues: {
    ":pk": "user#123"
  },
  ScanIndexForward: false // descending order
};
This query touches only the data it needs.
There are no filters, no scans, and no wasted capacity.
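For the query to find anything, each post has to carry the index key attributes when it is written. Here is a minimal sketch of that write side, assuming a base-table partition key called `PK` and a `title` attribute, neither of which this article specifies. The `buildPostItem` and `buildUserPostsQuery` helpers are hypothetical, and the plain JavaScript values assume you send them through a document-style client rather than the low-level API.

```javascript
// Builds a post item that carries the GSI1 key attributes described above.
// PK and title are assumptions made for this example.
function buildPostItem({ userId, postId, createdAt, title }) {
  return {
    PK: `post#${postId}`,        // assumed base-table partition key
    GSI1PK: `user#${userId}`,    // query scope: a single user
    GSI1SK: `post#${createdAt}`, // ordering: creation time as an ISO 8601 UTC string
    title,
  };
}

// Builds the Query input for "all posts for a user, newest first".
function buildUserPostsQuery(userId) {
  return {
    TableName: "app",
    IndexName: "GSI1",
    KeyConditionExpression: "GSI1PK = :pk",
    ExpressionAttributeValues: { ":pk": `user#${userId}` },
    ScanIndexForward: false, // descending sort-key order
  };
}

const item = buildPostItem({
  userId: "123",
  postId: "p1",
  createdAt: "2025-01-10T12:00:00Z",
  title: "Hello",
});
console.log(item.GSI1PK, item.GSI1SK);
console.log(buildUserPostsQuery("123"));
```

Keeping the key format in one helper means the write path and the read path cannot drift apart.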
Step 3: Use the Sort Key to Encode Meaning
The sort key is far more powerful than a simple timestamp or ID. It can encode multiple dimensions of meaning in a single value.
This is one of DynamoDB’s most important modelling techniques.
Instead of storing type and time as separate attributes, you can encode them together in the sort key. This allows you to efficiently query by prefix, order by time, and paginate without additional overhead.
GSI1SK = post#2025-01-10T12:00:00Z
Now you can fetch only posts while preserving chronological ordering.
const params = {
  TableName: "app",
  IndexName: "GSI1",
  KeyConditionExpression:
    "GSI1PK = :pk AND begins_with(GSI1SK, :prefix)",
  ExpressionAttributeValues: {
    ":pk": "user#123",
    ":prefix": "post#"
  }
};
This design avoids filtering entirely and ensures that DynamoDB does the minimum amount of work possible.
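The chronological ordering relies on the fact that DynamoDB compares string sort keys byte by byte, and ISO 8601 timestamps in one fixed format and time zone sort the same way as text and as dates. The sketch below imitates that ordering locally with a plain string sort, which matches for ASCII-only keys. The sample keys and the `buildPostsInRangeQuery` helper are hypothetical, and the date-range query is my own extension of the prefix idea rather than something this article's design requires.

```javascript
// Sample sort keys for one user partition, deliberately out of order.
const sortKeys = [
  "post#2025-01-10T12:00:00Z",
  "comment#2025-01-09T08:30:00Z",
  "post#2024-12-31T23:59:59Z",
  "post#2025-01-02T00:00:00Z",
];

// A plain string sort stands in for DynamoDB's byte ordering (ASCII keys only).
const ordered = [...sortKeys].sort();

// Local equivalent of begins_with(GSI1SK, "post#").
const posts = ordered.filter((key) => key.startsWith("post#"));
console.log(posts);

// Because the timestamp follows the prefix, a date range is also a key condition.
function buildPostsInRangeQuery(userId, fromIso, toIso) {
  return {
    TableName: "app",
    IndexName: "GSI1",
    KeyConditionExpression: "GSI1PK = :pk AND GSI1SK BETWEEN :from AND :to",
    ExpressionAttributeValues: {
      ":pk": `user#${userId}`,
      ":from": `post#${fromIso}`,
      ":to": `post#${toIso}`,
    },
  };
}

console.log(buildPostsInRangeQuery("123", "2025-01-01T00:00:00Z", "2025-01-31T23:59:59Z"));
```

This only holds if every timestamp uses the same format, so it is worth normalizing to UTC before building the key.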
Step 4: Avoid Designing “Generic” GSIs
A frequent anti-pattern is creating GSIs that appear flexible but lack a clear purpose.
An index with status as the partition key and createdAt as the sort key may seem useful, but status has only a handful of distinct values, so it usually leads to hot partitions and inefficient queries.
The problem is not the attributes themselves, but the question they attempt to answer.
A query like “Get everything with status = active” is too broad to scale well in DynamoDB.
A better approach is to narrow the scope of the question.
Instead of asking for all active items globally, ask for active items within a known boundary, such as an account or organization.
GSI2PK = account#<accountId>#subscriptions
GSI2SK = active#<createdAt>
This design distributes load evenly and ensures predictable performance.
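A query against this index could look like the sketch below. I am assuming the index is named `GSI2` and lives on the same `app` table as the earlier examples, and both helper functions are hypothetical names used only for illustration.

```javascript
// Builds the GSI2 key attributes for a subscription item.
function buildSubscriptionKeys({ accountId, status, createdAt }) {
  return {
    GSI2PK: `account#${accountId}#subscriptions`, // scope: one account
    GSI2SK: `${status}#${createdAt}`,             // status first, then creation time
  };
}

// Builds the Query input for "active subscriptions in this account".
function buildActiveSubscriptionsQuery(accountId) {
  return {
    TableName: "app",  // assumed table name
    IndexName: "GSI2", // assumed index name
    KeyConditionExpression: "GSI2PK = :pk AND begins_with(GSI2SK, :status)",
    ExpressionAttributeValues: {
      ":pk": `account#${accountId}#subscriptions`,
      ":status": "active#",
    },
  };
}

const keys = buildSubscriptionKeys({
  accountId: "42",
  status: "active",
  createdAt: "2025-01-10T12:00:00Z",
});
console.log(keys);
console.log(buildActiveSubscriptionsQuery("42"));
```

The status still takes part in the query, but as a sort key prefix inside a bounded partition instead of as the partition key itself.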
Step 5: Treat GSIs as Query Contracts
Every GSI you add to a table is a long-term commitment. It consumes write capacity and storage, and it adds operational complexity.
Because of that, each index should exist for a specific, well-understood query.
You should be able to point to application code and say exactly where and why the index is used. If you cannot answer who queries it, how often it is queried, and what order the results must be returned in, the index likely does not belong in your table.
Designing GSIs backwards enforces this discipline naturally.
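One lightweight way to keep these contracts visible is to record them next to the code. The sketch below is purely illustrative: the `gsiContracts` registry, the handler names in `usedBy`, the stated orderings, and the `findUnjustifiedIndexes` helper are all hypothetical and not something DynamoDB provides.

```javascript
// Hypothetical registry: one entry per GSI, describing the query it exists for.
const gsiContracts = {
  GSI1: {
    question: "Get all posts for a user ordered by creation date",
    usedBy: ["listUserPosts"], // hypothetical handler name
    order: "createdAt descending",
  },
  GSI2: {
    question: "Get active subscriptions for an account",
    usedBy: ["listActiveSubscriptions"], // hypothetical handler name
    order: "createdAt ascending",        // assumed ordering
  },
};

// Returns the names of indexes that cannot say what they answer,
// who queries them, or what order the results must come back in.
function findUnjustifiedIndexes(contracts) {
  return Object.entries(contracts)
    .filter(([, c]) => !c.question || !c.order || !(c.usedBy && c.usedBy.length > 0))
    .map(([name]) => name);
}

console.log(findUnjustifiedIndexes(gsiContracts));
console.log(findUnjustifiedIndexes({ ...gsiContracts, GSI3: { question: "" } }));
```

An index that shows up in the second kind of result is a candidate for removal, or at least for a conversation.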
Step 6: Design for the “No Filter” Rule
A strong signal of a well-designed GSI is that queries rely exclusively on KeyConditionExpression.
Filters are evaluated after data is read, which means you pay read capacity for every item the query reads, including the ones the filter then discards.
If you frequently find yourself adding filter expressions to remove unwanted entity types, that logic belongs in the key design instead.
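To see the difference, here is a small in-memory simulation, not a real DynamoDB call. The sample partition, the `type` attribute, and both functions are hypothetical. The two counters play the role of the `ScannedCount` and `Count` fields that a Query response returns.

```javascript
// In-memory stand-in for one partition of GSI1 (user#123), in sort-key order.
const partition = [
  { GSI1SK: "comment#2025-01-09T08:30:00Z", type: "comment" },
  { GSI1SK: "like#2025-01-09T09:00:00Z", type: "like" },
  { GSI1SK: "post#2025-01-02T00:00:00Z", type: "post" },
  { GSI1SK: "post#2025-01-10T12:00:00Z", type: "post" },
];

// Key condition: only items whose sort key matches the prefix are read.
function queryWithKeyCondition(items, prefix) {
  const matched = items.filter((item) => item.GSI1SK.startsWith(prefix));
  return { itemsRead: matched.length, itemsReturned: matched.length };
}

// Filter expression: every item in the partition is read,
// then the non-matching ones are dropped before the response is returned.
function queryWithFilter(items, predicate) {
  const returned = items.filter(predicate);
  return { itemsRead: items.length, itemsReturned: returned.length };
}

console.log(queryWithKeyCondition(partition, "post#"));
console.log(queryWithFilter(partition, (item) => item.type === "post"));
```

Both calls return the same two posts, but the filtered version reads the whole partition to get them, and that gap grows with every other entity type stored under the same key.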
Conclusion
Designing GSIs backwards from query requirements forces you to think clearly about how your application actually uses data.
Instead of indexing attributes “just in case,” you design intentional access paths that are fast, predictable, and cost-efficient.
When you shift your mindset from “What should I index?” to “What question must DynamoDB answer in a single query?”, DynamoDB stops feeling limiting and starts feeling purpose-built.
👋 My name is Uriel Bitton and I’m committed to helping you master Serverless, Cloud Computing, and AWS.
🚀 If you want to learn how to build serverless, scalable, and resilient applications, you can also follow me on LinkedIn for valuable daily posts.
Thanks for reading and see you in the next one!


