Query Design
Filtering, sorting, pagination, and sparse fieldsets done right
In a nutshell
Most APIs need to let clients search, filter, sort, and page through lists of data. Query design is about choosing consistent patterns for these operations -- how clients filter results, which direction to sort, how to move through pages without missing items, and how to request only the fields they need. Getting this right once means every list endpoint in your API works the same way.
The situation
Your list endpoint returns all 12,000 records. The mobile app chokes on the payload. You add a limit parameter. Then someone needs sorting. Then filtering by status. Then filtering by date range. Then combining filters.
Six months in, your query parameters are an inconsistent mess: ?status=active&sortBy=created_at&sortDir=desc&limit=20&offset=40&fields=id,name. There's no standard, and every endpoint handles these parameters slightly differently.
Query design is the part of API design you don't think about until it's too late.
Filtering
Simple filters
For basic equality filtering, use query parameters that map to field names:
GET /api/tasks?status=active
GET /api/tasks?status=active&priority=high
GET /api/tasks?assignee_id=user_123{
"data": [
{
"id": "task_1",
"title": "Review PR #42",
"status": "active",
"priority": "high",
"assigneeId": "user_123"
}
],
"meta": { "total": 1 }
}Structured filters
When you need operators (greater than, less than, contains), use bracket syntax:
# Tasks created after a date with high priority
GET /api/tasks?filter[status]=active&filter[priority]=high&filter[created_after]=2026-04-01
# Orders above a certain amount
GET /api/orders?filter[total_gte]=100&filter[status]=confirmedThe bracket syntax (filter[field]) namespaces your filters, keeping them separate from other query parameters like sort or page. It's the convention used by JSON:API and many well-designed APIs.
Start simple
Don't build a full query language on day one. Start with equality filters (?status=active). Add operators (filter[total_gte]=100) only when consumers actually need them. You can always add parameters — you can't remove them.
Sorting
Use a sort parameter with field names. Prefix with - for descending:
# Newest first
GET /api/tasks?sort=-created_at
# By priority (ascending), then by creation date (newest first)
GET /api/tasks?sort=priority,-created_at{
"data": [
{ "id": "task_3", "title": "Deploy hotfix", "priority": "high", "createdAt": "2026-04-13T10:00:00Z" },
{ "id": "task_1", "title": "Review PR #42", "priority": "high", "createdAt": "2026-04-12T15:30:00Z" },
{ "id": "task_2", "title": "Update docs", "priority": "medium", "createdAt": "2026-04-13T08:00:00Z" }
]
}The - prefix convention is widely adopted and reads naturally: sort=-created_at means "sort by created_at, descending."
Don't allow sorting on unindexed columns
Every sortable field should have a database index. Allowing arbitrary sorting on any field is a performance footgun waiting to fire on a table with a million rows.
Pagination
This is where most APIs get it wrong. There are two approaches, and one is almost always better.
Offset pagination
The familiar pattern: page and pageSize (or limit and offset).
GET /api/tasks?page=3&page_size=20{
"data": [ "...20 tasks..." ],
"meta": {
"total": 247,
"page": 3,
"pageSize": 20,
"totalPages": 13
}
}The problem: offset pagination breaks when data changes between requests. If a new task is inserted while a user is paging through results, they'll either skip a task or see a duplicate. It also gets slower on large datasets — OFFSET 100000 means the database scans and discards 100,000 rows.
Cursor pagination
Instead of a page number, use an opaque cursor — a Base64-encoded string that encodes the position in the result set (typically the last item's sort key + ID):
# First page
GET /api/tasks?page[size]=20
# Next page (using cursor from previous response)
GET /api/tasks?page[size]=20&page[after]=eyJjcmVhdGVkQXQiOiIyMDI2LTA0LTEyVDE1OjMwOjAwWiIsImlkIjoidGFza18xMjMifQ{
"data": [ "...20 tasks..." ],
"meta": {
"hasNextPage": true,
"hasPrevPage": true,
"startCursor": "eyJjcm...",
"endCursor": "eyJjcm..."
}
}The cursor is typically a Base64-encoded combination of the sort field and the item ID, ensuring stable positioning even when new items are inserted.
Navigation links (HATEOAS pagination)
The real unlock with pagination is not making the frontend build URLs at all. Instead, return _links with pre-built navigation URLs:
{
"data": [ "...20 tasks..." ],
"meta": {
"hasNextPage": true,
"hasPrevPage": true
},
"_links": {
"self": { "href": "/api/tasks?page[size]=20&page[after]=abc123" },
"next": { "href": "/api/tasks?page[size]=20&page[after]=def456" },
"prev": { "href": "/api/tasks?page[size]=20&page[before]=abc123" },
"first": { "href": "/api/tasks?page[size]=20" }
}
}This is one of the few places where HATEOAS delivers real, practical value. The frontend just follows _links.next — it never constructs a pagination URL. This means:
- The backend owns pagination logic entirely — you can change from offset to cursor without touching the frontend
- No cursor format leaks — the frontend treats cursors as opaque strings
- Fewer bugs — the frontend can't build a malformed pagination URL because it never builds one
// Frontend pagination — no URL construction needed
async function loadNextPage(currentResponse) {
if (!currentResponse._links.next) return null;
const res = await fetch(currentResponse._links.next.href);
return res.json();
}Use _links for pagination, skip HATEOAS everywhere else
Full HATEOAS (adding _links to every resource for every possible action) is almost never worth the complexity. But for pagination specifically, _links with next/prev/first/last is genuinely useful and widely adopted. This is the pragmatic middle ground.
Default to cursor pagination
Cursor pagination is stable, performant, and works at any scale. Offset pagination is simpler to implement but breaks under real-world conditions. Use offset only for admin dashboards or reports where users genuinely need "go to page 7." For everything else, use cursors.
Comparison
| Aspect | Offset | Cursor |
|---|---|---|
| Jump to specific page | Yes | No |
| Stable across inserts/deletes | No | Yes |
| Performance at scale | Degrades with offset size | Constant |
| Implementation complexity | Lower | Moderate |
| Total count available | Yes (but expensive) | Not inherently |
| Best for | Admin UIs, reports | Feeds, infinite scroll, APIs |
Sparse fieldsets
Not every consumer needs every field. Sparse fieldsets let consumers request only the fields they care about:
GET /api/tasks?fields=id,title,status{
"data": [
{ "id": "task_1", "title": "Review PR #42", "status": "active" },
{ "id": "task_2", "title": "Update docs", "status": "done" }
]
}Without sparse fieldsets, the same response might include description, comments, attachments, activityLog, and 15 other fields the mobile client never renders. That's wasted bandwidth and wasted parsing time.
For APIs with related resources, use dot notation:
GET /api/tasks?fields=id,title&fields[assignee]=id,name,avatarAlways return the ID
Even when a consumer requests fields=name,email, always include the id field. Consumers need it for updates, caching, and deduplication. It's such a universal requirement that it shouldn't need to be requested explicitly.
Putting it all together
A well-designed query endpoint supports all four concerns in a single request:
GET /api/tasks?filter[status]=active&filter[priority]=high&sort=-created_at&fields=id,title,status,priority&page[size]=20&page[after]=cursor123{
"data": [
{ "id": "task_3", "title": "Deploy hotfix", "status": "active", "priority": "high" },
{ "id": "task_1", "title": "Review PR #42", "status": "active", "priority": "high" }
],
"meta": {
"hasNextPage": true,
"endCursor": "eyJjcmVhdGVk..."
}
}Clean, predictable, efficient. Every parameter follows a consistent pattern that consumers can learn once and apply across all your list endpoints.
Next up: error handling and status codes — because returning 200 for errors is a crime against HTTP.