How To Perform Powerful Filtering In DynamoDB Using The Sort Key
Filtering in DynamoDB starts with your sort keys not FilterExpressions
Most DynamoDB users get this wrong.
If you’ve been filtering your DynamoDB data with FilterExpressions, you’re doing it wrong.
It is inefficient, costing you a lot, and greatly slowing down your queries.
Instead, you should be using your sort key in your queries to filter data.
You’ll get 100% query and cost efficiency as well as high performance.
I’ll show you how to do this using only the sort key, but first we must understand why we cannot use FilterExpressions.
Why you shouldn’t use FilterExpression
When you run a query, DynamoDB will fetch the data that matches your primary key (partition + sort key).
All of the items that match will be fetched (limited to 1MB of data). If you add a filter expression, the filtering will be applied on the data already fetched.
This effectively means, your filter has no cost or latency optimization (as it should have). The entire dataset is already fetched, unfiltered, and returned to you filtered.
You could require only 10 items but will pay for potentially 100 items (in latency and costs).
Imagine a hotels database which store rooms records and include the room’s view type:
If I want to filter by the rooms by “sea” views, you could run this filter expression:
I see a message saying the efficiency of my query was 40% — not great. It had to indeed fetch 5 items but return 2.
However, there is a much more efficient way to filter your data, one that requires a much more careful design.
We can design our data so that in the case above, if we need the 2 sea views, we will only fetch those 2 items.
How to Filter with the Sort Key
Let’s design our sort keys to achieve powerful filtering.
We’ll integrate the view type directly in our sort keys:
Instead of:
pk: hotel#101
sk: room#201
We do:
pk: hotel#101
sk: room#sea#201
Now we can efficiently filter our rooms by view type.
If we want to get all sea views, we can run the following query:
const params = {
TableName: "hotels",
KeyConditionExpression: "pk = 'hotel#101' AND begins_with(sk, 'room#sea#');
We use the begins_with() method and add the string value “room#sea”. Since all rooms with a sea view have the sort key value prefix of “room#<view>” we can simply pass in the view type into our query and have it filter for that view type.
Here’s the query being run in the console:
The result and query efficiency:
We can see now the query efficiency is 100%.
All of the data being fetched is actually returned.
If we need all garden views, we can simply pass in garden in the sort key begins_with() method:
That returns the single garden view room.
Conclusion
By designing your sort keys to include filterable attributes, you can enable powerful performance in your DynamoDB queries.
This design strategy eliminates unnecessary data fetching and makes sure all items retrieved are items you actually need, resulting in 100% query efficiency.
Stop relying on DynamoDB FilterExpressions and start architecting your sort keys for powerful and cost-effective filtering.
👋 My name is Uriel Bitton and I’m committed to helping you master Serverless, Cloud Computing, and AWS.
🚀 If you want to learn how to build serverless, scalable, and resilient applications, you can also follow me on Linkedin for valuable daily posts.
Please consider a paid subscription if you found this post valuable.
Thanks for reading and see you in the next one!