Agent Tool Pagination Documentation Specification
A paginated tool consumed by AI agents must be documented with an opaque cursor scheme, explicit per-call and per-task page-size limits, a deterministic ordering, an explicit has_more boolean, and a termination guarantee. Cursor-based pagination is REQUIRED; offset/page-number is RESERVED for small, stable collections.
TL;DR
Agents do not browse — they iterate. A paginated tool that an autonomous agent calls must guarantee that (1) the cursor is opaque and stable, (2) ordering is deterministic and documented, (3) page size is bounded both per-call and per-task, (4) termination is explicit (has_more: false), and (5) the schema does not change mid-iteration. This spec lists the required documentation fields, an example contract, and the failure modes that motivate each rule.
Scope
This spec applies to any tool surface, including MCP server tools and HTTP APIs invoked through agent function calling, that returns more results than fit in a single response. It complements the agent webhook documentation specification (asynchronous deliveries) and agent output validation specification (per-record schema validation).
Why pagination is different for agents
Human clients tolerate ambiguous pagination because a person at a screen can recover from surprises: a missing record, a duplicated row, a silent reordering. Agents cannot. They iterate in a loop with limited working memory, and any inconsistency in the page sequence either truncates the answer or burns the context window with redundant content. Three properties matter most:
- Determinism. The same request must return the same items in the same order, even if the underlying data is mutating.
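- Boundedness. Token budgets are finite. The agent must be able to bound how much it will fetch before it starts: from total where one is provided, otherwise from a documented page-size limit and a cap on the number of pages.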
- Termination. The loop must end on an explicit signal, not on guesswork about whether "empty page" means "done" or "transient gap".
Required Documentation Fields
1. Pagination scheme
Document the scheme as one of:
| Scheme | When to use | Agent-safe |
|---|---|---|
| Cursor (opaque token) | Default for any open-ended or mutable collection | Yes |
| Keyset (sortable column) | When clients need to resume from a known sort key | Yes, if cursor is opaque to the agent |
| Offset / page number | Small, immutable, finite lists (e.g. enum tables) | Discouraged — see common mistakes |
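Cursor-style pagination is common in widely used APIs: Stripe's list endpoints page with starting_after and ending_before object IDs plus a has_more flag, and GitHub's REST API advertises the next page through Link headers as defined in RFC 8288 (some of its endpoints are cursor-based, many are page-numbered). Document which scheme your tool uses; never mix schemes in the same endpoint.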
2. Cursor format
Cursors MUST be:
- Opaque to the caller. The agent never parses the cursor; it round-trips the value as-is.
- Stable across pages. A cursor returned in page N is the only valid input for page N+1.
- Time-bounded. Document the validity window (typical: 24 hours). Document the error code returned for an expired cursor.
- Single-use-safe. Document whether re-using the same cursor returns the same page (idempotent, RECOMMENDED) or errors.
Do not encode user-meaningful information (offsets, IDs, timestamps) in plain text inside the cursor. Agents will attempt to parse and increment them, producing pathological behaviour.
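One way to satisfy the cursor rules above is to hand out random tokens and keep the real position on the server. The sketch below is a minimal illustration under that assumption; CursorStore, CursorError, and the in-memory dictionary are hypothetical names, not part of this spec, and a real service would likely use shared storage.

```python
import secrets
import time
from typing import Optional


class CursorError(Exception):
    """Carries one of the documented error codes."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


class CursorStore:
    """Hypothetical in-memory cursor table: token -> (position, issue time)."""

    def __init__(self, ttl_seconds: float = 24 * 3600):
        self.ttl_seconds = ttl_seconds
        self._entries = {}

    def issue(self, position: dict, now: Optional[float] = None) -> str:
        # The token is random, so there is nothing for a caller to parse.
        token = secrets.token_urlsafe(16)
        issued_at = time.time() if now is None else now
        self._entries[token] = (position, issued_at)
        return token

    def resolve(self, token: str, now: Optional[float] = None) -> dict:
        entry = self._entries.get(token)
        if entry is None:
            raise CursorError("cursor_invalid")
        position, issued_at = entry
        current = time.time() if now is None else now
        if current - issued_at > self.ttl_seconds:
            raise CursorError("cursor_expired")
        # The entry is kept, so re-using a cursor returns the same position.
        return position
```

Because the token carries no data, an agent has nothing to increment, and the validity window and both cursor error codes are enforced in one place.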
3. Page-size controls
Document both bounds:
- page_size request parameter, with default and maximum values (typical: default 25, max 100).
- A per-task page-size cap that agents should respect to avoid context-window exhaustion. Recommend it explicitly: "For agent use, set page_size to 20 or below."
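The two bounds can be sketched as a server-side check plus a client-side budget estimate. The constants reuse the typical values quoted above; resolve_page_size, pages_within_budget, the page_size_invalid code, and the tokens-per-item estimate are illustrative assumptions rather than part of the spec.

```python
from typing import Optional

DEFAULT_PAGE_SIZE = 25   # typical default from the text
MAX_PAGE_SIZE = 100      # typical maximum from the text
AGENT_PAGE_SIZE = 20     # recommended agent value from the text


class PageSizeError(ValueError):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def resolve_page_size(requested: Optional[int]) -> int:
    """Server side: apply the default and reject out-of-range values."""
    if requested is None:
        return DEFAULT_PAGE_SIZE
    if requested < 1:
        raise PageSizeError("page_size_invalid")  # hypothetical code, not in the spec
    if requested > MAX_PAGE_SIZE:
        raise PageSizeError("page_size_exceeds_max")
    return requested


def pages_within_budget(token_budget: int, est_tokens_per_item: int, page_size: int) -> int:
    """Agent side: how many full pages fit in the tokens set aside for this task."""
    return token_budget // (est_tokens_per_item * page_size)
```

For instance, with 8,000 tokens set aside, an estimated 100 tokens per item, and AGENT_PAGE_SIZE items per page, the agent can plan for four pages.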
4. Ordering
Document the sort order as a tuple of fields, primary and tiebreaker. Example: "results are ordered by created_at descending, with id as a deterministic tiebreaker". A single-field sort is insufficient if the field is not unique — ties produce non-deterministic ordering across pages.
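To see why the tiebreaker matters, the sketch below orders rows by created_at descending with id ascending as the tiebreaker, then resumes strictly after the last row of the previous page. The row shape and the helper names order_rows and page_after are assumptions for illustration; timestamps are ISO 8601 UTC strings in one fixed format, so they compare correctly as text.

```python
def order_rows(rows):
    """Sort by created_at descending, then id ascending, via two stable sorts."""
    by_id = sorted(rows, key=lambda r: r["id"])
    return sorted(by_id, key=lambda r: r["created_at"], reverse=True)


def page_after(ordered, after, size):
    """Return up to `size` rows strictly after the (created_at, id) position."""
    if after is None:
        return ordered[:size]
    after_ts, after_id = after
    rest = [
        r for r in ordered
        if r["created_at"] < after_ts
        or (r["created_at"] == after_ts and r["id"] > after_id)
    ]
    return rest[:size]


TS = "2026-05-02T10:00:00Z"
rows = [
    {"id": "doc_c", "created_at": TS},
    {"id": "doc_a", "created_at": TS},
    {"id": "doc_0", "created_at": "2026-05-01T09:00:00Z"},
    {"id": "doc_b", "created_at": TS},
]
ordered = order_rows(rows)
page1 = page_after(ordered, None, 2)
last = page1[-1]
page2 = page_after(ordered, (last["created_at"], last["id"]), 2)
```

Three rows share the same created_at here. Without the id comparison, the boundary between page1 and page2 would depend on whatever order the storage layer happened to return.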
5. Termination signal
The response MUST include an explicit has_more boolean (or equivalent). Do not require the agent to infer termination from response length, missing cursor, or HTTP status. The two valid response shapes, continuing and final, are:
{ "data": [...], "next_cursor": "abc123", "has_more": true }
{ "data": [...], "next_cursor": null, "has_more": false }
The redundancy of next_cursor: null and has_more: false is intentional: agents check has_more first, and the null cursor is a safety net.
6. Stability under mutation
Document what happens if the underlying data changes during iteration:
- Snapshot (RECOMMENDED). The cursor pins a snapshot; later inserts/deletes are not visible until the agent restarts iteration. Predictable, idempotent, agent-safe.
- Live. New items may appear in later pages or be skipped entirely. Acceptable only if explicitly documented and the agent's task tolerates it (e.g. log-tailing).
- Mixed. Forbidden. If you cannot commit to one model, the contract is not agent-safe.
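The snapshot model can be sketched as a pager that freezes the ordered list on the first call and ties every later cursor to that frozen copy. SnapshotPager and its in-memory bookkeeping are hypothetical; a production service would more likely pin a database snapshot or version than copy the data.

```python
import secrets


class SnapshotPager:
    """Pins the ordered id list when iteration starts (snapshot model)."""

    def __init__(self, collection):
        self.collection = collection   # live, mutable list of ids
        self._snapshots = {}           # cursor token -> (frozen ids, next index)

    def page(self, cursor=None, page_size=2):
        if cursor is None:
            # A call without a cursor starts a new iteration and a new snapshot.
            frozen, start = tuple(self.collection), 0
        else:
            frozen, start = self._snapshots[cursor]
        end = start + page_size
        has_more = end < len(frozen)
        next_cursor = None
        if has_more:
            next_cursor = secrets.token_urlsafe(8)
            self._snapshots[next_cursor] = (frozen, end)
        return {
            "data": list(frozen[start:end]),
            "next_cursor": next_cursor,
            "has_more": has_more,
        }


live = ["a", "b", "c", "d"]
pager = SnapshotPager(live)
first = pager.page()
live.insert(0, "new")   # insert during iteration
live.remove("c")        # delete during iteration
second = pager.page(first["next_cursor"])
restarted = pager.page()
```

The second page still reflects the collection as it was when iteration began; only the restarted iteration sees the insert and the delete.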
7. Total counts
Document whether a total field is provided:
- For finite, bounded collections: include total so agents can budget pages up front.
- For unbounded or expensive-to-count collections: omit total and provide has_more only. Do not return an estimated total without flagging it as such.
8. Errors
Document these error cases with codes:
- cursor_expired (cursor no longer valid; agent restarts iteration from the first page, without a cursor)
- cursor_invalid (malformed or tampered cursor; non-retryable)
- page_size_exceeds_max (request rejected; agent reduces and retries)
- rate_limited (with retry-after; standard back-off)
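On the agent side, the four codes above map to four different recovery actions. The sketch below shows one possible mapping; plan_recovery, the action names, the halving strategy, and the one-second fallback delay are assumptions, not requirements of this spec.

```python
from typing import Optional


def plan_recovery(error_code: str, page_size: int, retry_after: Optional[float] = None) -> dict:
    """Translate a documented pagination error code into the agent's next step."""
    if error_code == "cursor_expired":
        # Start over without a cursor; earlier pages must be re-fetched.
        return {"action": "restart", "cursor": None, "page_size": page_size}
    if error_code == "cursor_invalid":
        # Non-retryable: the cursor was malformed or tampered with.
        return {"action": "abort"}
    if error_code == "page_size_exceeds_max":
        # Halving is an arbitrary reduction strategy chosen for this sketch.
        return {"action": "retry", "page_size": max(1, page_size // 2)}
    if error_code == "rate_limited":
        delay = 1.0 if retry_after is None else retry_after
        return {"action": "wait", "seconds": delay}
    # Unknown codes are treated as fatal rather than retried blindly.
    return {"action": "abort"}
```

Keeping the mapping in one function makes it easy to check that every documented code has a defined behaviour and that nothing falls back to an unbounded retry.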
Worked Example: search_documents v1
Request
GET /v1/documents?query=auth&page_size=20&cursor=eyJpZCI6Ii4uLiJ9
Response
{
  "data": [
    { "id": "doc_2a91", "title": "OAuth setup", "updated_at": "2026-05-02T10:00:00Z" }
  ],
  "next_cursor": "eyJpZCI6ImRvY18yYTkxIn0",
  "has_more": true,
  "page_size": 20,
  "ordering": "updated_at desc, id asc"
}
An agent iterating this tool MUST loop while has_more === true, pass next_cursor unchanged on the next call, and stop when has_more becomes false. The agent SHOULD also stop when it has accumulated enough information to answer the user's question, even if more pages remain.
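The iteration rules above can be sketched as a bounded loop. Everything here is illustrative: make_fake_tool stands in for the real search_documents call, and collect, max_pages, and the enough callback are hypothetical names for the page budget and the early-stop check.

```python
import secrets


def make_fake_tool(docs):
    """Stand-in for the real tool: serves `docs` in pages behind random cursors."""
    positions = {}

    def fetch_page(cursor=None, page_size=20):
        start = 0 if cursor is None else positions[cursor]
        end = start + page_size
        has_more = end < len(docs)
        next_cursor = None
        if has_more:
            next_cursor = secrets.token_urlsafe(8)
            positions[next_cursor] = end
        return {"data": docs[start:end], "next_cursor": next_cursor, "has_more": has_more}

    return fetch_page


def collect(fetch_page, page_size=20, max_pages=10, enough=None):
    """Iterate until has_more is false, the task is satisfied, or the budget runs out."""
    items, cursor = [], None
    for _ in range(max_pages):
        page = fetch_page(cursor=cursor, page_size=page_size)
        items.extend(page["data"])
        if not page["has_more"]:
            return items, "complete"          # explicit termination signal
        if enough is not None and enough(items):
            return items, "enough"            # early stop: question can be answered
        cursor = page["next_cursor"]          # passed back unchanged, never parsed
    return items, "budget_exhausted"
```

Returning the stop reason alongside the items lets the caller distinguish a complete result from one that was cut short by the page budget.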
Common Mistakes
- Offset pagination on a mutable collection. Inserts and deletes between pages cause skipped or duplicated rows. Cursor pagination eliminates this.
- Encoding offsets inside the cursor in plain text. Agents will reverse-engineer and increment them, defeating the opacity contract.
- No tiebreaker on the sort field. Ties between rows with the same primary sort key produce non-deterministic order. Always document a unique tiebreaker.
- Implicit termination. Returning an empty page or null next_cursor without has_more: false forces the agent to guess. Agents that guess wrong either loop forever or stop early.
- Schema drift mid-iteration. Adding or removing fields between pages breaks the agent's parsing. Pin the schema version, either in the cursor or through an explicit version header, for the duration of an iteration.
- Unbounded page_size. Lets an agent ask for a million rows and crash either side. Always cap.
- No per-task budget guidance. Agents will pick the maximum page size when allowed. Document the recommended agent-friendly value separately from the technical maximum.
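The first mistake in the list is easy to reproduce. In this minimal sketch, with integer ids standing in for rows sorted ascending and two hypothetical helpers, a single deletion between calls makes offset pagination skip a row, while resuming after the last seen id does not.

```python
def offset_page(rows, offset, size):
    """Offset pagination: position is a count of rows to skip."""
    return rows[offset:offset + size]


def keyset_page(rows, after_id, size):
    """Keyset pagination: position is the last id the caller has seen."""
    remaining = [r for r in rows if after_id is None or r > after_id]
    return remaining[:size]


rows = [1, 2, 3, 4, 5, 6]            # ids in ascending order
first = offset_page(rows, 0, 3)      # the caller sees ids 1, 2, 3
rows.remove(2)                       # a delete lands between the two calls

second_by_offset = offset_page(rows, 3, 3)          # id 4 is silently skipped
second_by_keyset = keyset_page(rows, first[-1], 3)  # resumes right after id 3
```

An insert ahead of the current position has the mirror-image effect under offsets: a row the caller has already seen is returned again on the next page.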
FAQ
Q: Can I document offset and cursor pagination for the same endpoint?
No. Mixed schemes confuse agents and double the surface area for bugs. Pick one. If you must support both for legacy reasons, expose them as distinct endpoints and clearly mark which one is agent-recommended.
Q: How long should a cursor stay valid?
Long enough to outlast the agent's longest reasonable iteration. 24 hours is a common default; one hour is the minimum for most agent workloads. Document the exact value and the error returned on expiry.
Q: Should I expose a prev_cursor?
Only if your collection is genuinely bidirectional (e.g. timeline scrolling). Most agent workloads iterate forward only, and a prev_cursor doubles the test surface. Default to forward-only.
Q: How do I document pagination for an MCP tool?
The same fields apply, exposed in the tool description. Make cursor an optional argument and return next_cursor, has_more, and (if applicable) total in the structured content. Reference this spec from the tool description so agents know the contract.
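As a rough illustration, a paginated tool definition might look like the dictionary below. The name, description, and inputSchema keys reflect the author's understanding of the MCP tool definition shape and should be checked against the MCP specification version you target; the tool itself, its wording, and the structured_payload helper are hypothetical.

```python
SEARCH_DOCUMENTS_TOOL = {
    "name": "search_documents",
    "description": (
        "Search documents. Paginated: pass next_cursor back unchanged as cursor "
        "and stop when has_more is false. Ordered by updated_at desc, id asc. "
        "For agent use, set page_size to 20 or below."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "page_size": {"type": "integer", "minimum": 1, "maximum": 100, "default": 25},
            "cursor": {"type": "string", "description": "Opaque token from a previous call."},
        },
        "required": ["query"],   # cursor is optional: omit it on the first call
    },
}


def structured_payload(data, next_cursor, total=None):
    """Build the pagination fields returned as the tool's structured content."""
    payload = {
        "data": data,
        "next_cursor": next_cursor,
        "has_more": next_cursor is not None,
    }
    if total is not None:
        payload["total"] = total   # only for finite, cheaply countable collections
    return payload
```

Stating the ordering, the termination rule, and the recommended page size in the description means the contract travels with the tool even when the agent never opens external documentation.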
Q: What about streaming or unbounded results?
Those are not pagination — they are streams. Document them with a separate spec covering chunking, keep-alives, and graceful termination. Do not retrofit cursor semantics onto a stream.