Agent Idempotency Documentation Specification

Agent idempotency documentation must declare whether each tool is read-only, naturally idempotent, key-based idempotent, or non-idempotent. For key-based tools, docs must specify the Idempotency-Key header field, its TTL and scope, and the replay response shape (201 only on first write, 200 with cached body on replay, 409 on key reuse with a different payload). For non-idempotent operations, docs must specify a compensation path.

TL;DR

Autonomous agents retry. Without explicit idempotency contracts, those retries duplicate orders, double-charge cards, and create ghost records. This spec defines what every agent-facing tool MUST document: a four-class idempotency taxonomy, an Idempotency-Key field with TTL and scope, the exact response status and body shape for replays, anti-patterns to flag, and machine-checkable assertions a manifest validator can run before publish. Pair with the Tool Use Documentation Spec and the Error Handling Spec. For the broader hub see /ai-agents/.

1. Definition

Agent idempotency documentation is a structured contract attached to a tool, function, or MCP server method that tells an autonomous agent — and any planner, retry middleware, or orchestrator above it — whether re-executing the same call is safe, and if not, what mechanism (key, conditional header, compensation) the agent must use to make it safe.

The contract has five required parts:

Idempotency class drawn from a fixed taxonomy.
Key field schema when the class requires a client-supplied key.
Key TTL and scope so the agent knows how long replays are honored and where the key uniqueness applies.
Replay response shape specifying the exact status code and body for first-write versus replay versus key-conflict.
Compensation path when the operation cannot be made idempotent.

A tool whose docs omit any part that applies to its declared class is undocumented for autonomous use. Reviewers MUST reject the manifest.
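The five parts can be modeled as a small record plus a completeness check. The following is a minimal sketch, not a normative schema: the `IdempotencyContract` type, its field names, and the rule that parts 2 to 4 apply only to key_idempotent tools and part 5 only to non_idempotent tools are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

CLASSES = {"read_only", "naturally_idempotent", "key_idempotent", "non_idempotent"}


@dataclass
class IdempotencyContract:
    """Hypothetical in-memory form of the five-part contract."""
    idempotency_class: str                   # part 1
    key_field: Optional[str] = None          # part 2
    ttl_seconds: Optional[int] = None        # part 3
    scope: Optional[str] = None              # part 3
    replay_statuses: Optional[dict] = None   # part 4
    compensation: Optional[dict] = None      # part 5


def missing_parts(contract: IdempotencyContract) -> list[str]:
    """Return the required parts that are absent for the declared class."""
    missing = []
    if contract.idempotency_class not in CLASSES:
        missing.append("idempotency_class")
    if contract.idempotency_class == "key_idempotent":
        for name in ("key_field", "ttl_seconds", "scope", "replay_statuses"):
            if getattr(contract, name) is None:
                missing.append(name)
    if contract.idempotency_class == "non_idempotent" and contract.compensation is None:
        missing.append("compensation")
    return missing
```

A reviewer tool could reject any manifest for which `missing_parts` returns a non-empty list.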

2. Why it matters

Autonomous agents differ from human-driven clients in three ways that make idempotency documentation non-optional:

Retries are automatic. Orchestrators built on frontier-model tool calling (Claude tool use, OpenAI function calling, Gemini function calling) commonly retry on network errors, timeouts, and partial JSON. If the underlying tool is not retry-safe and the docs do not say so, the agent will re-execute it.
Plans are persisted. Agent stacks built on the Anthropic Agent SDK or the Model Context Protocol can resume a plan in a different process from the one that started it. A resumed plan that reissues a create-order call has no way to know the original succeeded unless the tool's response carries idempotent semantics.
Failures are partial. Network or process failure between request and response is the common case at scale. The agent receives no answer; the server may have committed. Without an idempotency contract, the agent cannot recover safely.

Industry references converge on the same answer: declare safety in machine-readable form, support a client-supplied key for unsafe writes, and expose the replay response so clients can detect duplicates. Stripe's idempotency model is the most-copied prior art and is explicitly designed for retry-driven clients. RFC 9110 §9.2.2 codifies the HTTP-level distinction between idempotent and non-idempotent methods that all agent tooling inherits.

3. How it works

The contract operates at three layers.

3.1 The four-class idempotency taxonomy

Every tool MUST declare exactly one class:

| Class | Definition | Retry guidance for the agent |
| --- | --- | --- |
| read_only | No server state change. Safe to retry indefinitely. | Retry freely. |
| naturally_idempotent | State change is, by construction, the same on every call (e.g. PUT /resources/{id} with full body, or cancel order if not cancelled). | Retry freely; compare response. |
| key_idempotent | State change is non-idempotent unless a client-supplied key is provided. Server deduplicates by key within TTL. | Retry only with the same key. |
| non_idempotent | State change cannot be made idempotent. A retry creates a duplicate. | Do not retry; use the documented compensation path. |

The class is the single most important field. An agent's planner uses it to decide whether to retry on timeout at all.
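A planner's use of the class can be sketched as a single lookup. The function name, the returned action strings, and the choice to refuse retries when the class is missing or when a key_idempotent call was sent without a key are illustrative assumptions, not part of the taxonomy itself.

```python
def retry_decision(idempotency_class: str, has_key: bool = False) -> str:
    """Map a declared class to a retry action after a timeout."""
    if idempotency_class in ("read_only", "naturally_idempotent"):
        return "retry"
    if idempotency_class == "key_idempotent":
        # Only safe when the original request carried a key that can be reused.
        return "retry_with_same_key" if has_key else "do_not_retry"
    if idempotency_class == "non_idempotent":
        return "use_compensation_path"
    # Unknown or undeclared class: assume the worst case.
    return "do_not_retry"
```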

3.2 The idempotency-key field schema

For key_idempotent tools, docs MUST include a field block:

idempotency:
  class: key_idempotent
  key:
    name: "Idempotency-Key"
    location: "header"          # header | query | body
    type: "string"
    format: "uuid-v4 | opaque"
    max_length: 64
    required: true
  ttl_seconds: 86400            # 24h is the conventional default
  scope: "account"              # account | user | tenant | global
  conflict_on_payload_change: 409

scope is the most-overlooked field. Per-account scope means two different accounts may legally reuse the same key string; global scope means the key namespace is shared by every caller. Stripe's idempotency docs describe keys that become eligible for removal once they are at least 24 hours old. Account-scoped keys with a 24-hour TTL are the safest default unless the operation is privileged.
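The effect of scope is easiest to see in how a server might namespace stored keys. This is a sketch under stated assumptions: `validate_key`, `storage_key`, and `new_key` are hypothetical helpers, and the printable-ASCII rule is an illustrative choice; only the 64-character limit comes from the field block above.

```python
import uuid

MAX_LENGTH = 64  # mirrors max_length in the field block


def validate_key(key: str) -> bool:
    """Accept non-empty, printable ASCII keys of at most MAX_LENGTH characters."""
    return 0 < len(key) <= MAX_LENGTH and key.isascii() and key.isprintable()


def storage_key(scope_id: str, key: str) -> tuple[str, str]:
    """Namespace the client key by the scope it is unique within."""
    return (scope_id, key)


def new_key() -> str:
    """Generate a UUIDv4 key, one of the documented accepted formats."""
    return str(uuid.uuid4())
```

With account scope, `storage_key("acct_A", k)` and `storage_key("acct_B", k)` are distinct entries, so the two accounts never collide. With global scope every caller passes the same `scope_id`, and the same key string maps to one shared entry.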

3.3 Replay response shape

Docs MUST specify the exact response shape for four cases. The contract is tightest here because agents key their retry decisions on status:

| Case | HTTP status | Body | Replay-detection header |
| --- | --- | --- | --- |
| First write succeeds | 201 Created (or domain-appropriate 2xx) | Created resource | Idempotency-Replay: false |
| Replay with same key + same payload | 200 OK | Cached response from first write, byte-identical | Idempotency-Replay: true |
| Replay with same key + different payload | 409 Conflict | Error body explaining payload mismatch | Idempotency-Conflict: payload-mismatch |
| Replay before first write completes | 409 Conflict (or 425 Too Early) | Error body indicating in-flight | Idempotency-Conflict: in-flight |

The Idempotency-Replay header lets an agent's downstream logic distinguish "I just made this happen" from "this had already happened". Without it, the agent must compare bodies, which is brittle.
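The four cases can be sketched as one server-side handler over an in-memory store. The `IdempotentStore` class, the `create` callback, and the error bodies are hypothetical. The sketch also ignores TTL eviction, and a `create` call that raises would leave its key marked in-flight, which a real server would need to handle.

```python
import json


class IdempotentStore:
    """In-memory sketch of the four replay cases."""

    def __init__(self):
        # (scope_id, key) -> {"payload": canonical JSON, "body": str or None}
        self._entries = {}

    def handle(self, scope_id, key, payload, create):
        canonical = json.dumps(payload, sort_keys=True)
        entry = self._entries.get((scope_id, key))
        if entry is None:
            # First write: reserve the key before doing the work.
            entry = {"payload": canonical, "body": None}
            self._entries[(scope_id, key)] = entry
            entry["body"] = create(payload)
            return 201, {"Idempotency-Replay": "false"}, entry["body"]
        if entry["payload"] != canonical:
            return 409, {"Idempotency-Conflict": "payload-mismatch"}, '{"error": "payload_mismatch"}'
        if entry["body"] is None:
            # The key is reserved but the first write has not finished.
            return 409, {"Idempotency-Conflict": "in-flight"}, '{"error": "in_flight"}'
        return 200, {"Idempotency-Replay": "true"}, entry["body"]
```

The key is reserved before `create` runs so that a concurrent request with the same key sees the in-flight state instead of starting a second write.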

4. Key concepts

4.1 TTL guidance

24 hours — safe default for transactional writes (orders, payments, support tickets).
7 days — long-running agent workflows that may resume after restart.
5 minutes — high-throughput, low-latency operations where the deduplication window is intrinsically narrow (rate-limited counters, ephemeral session creation).

Document the TTL precisely. Never use "indefinite"; storage cost forces eventual eviction, and an agent that assumes infinite TTL will eventually duplicate.
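Why a finite, documented TTL matters can be shown with a small expiring store. `TtlKeyStore` and its injectable `clock` parameter are assumptions for illustration; the point is only that once an entry is evicted, a retry with the old key is indistinguishable from a first write.

```python
import time


class TtlKeyStore:
    """Sketch of a key store that honors replays only within ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, response_body)

    def put(self, key, body):
        self._entries[key] = (self._clock(), body)

    def get(self, key):
        """Return the cached body, or None if the key is unknown or expired."""
        item = self._entries.get(key)
        if item is None:
            return None
        stored_at, body = item
        if self._clock() - stored_at >= self.ttl_seconds:
            # Evicted: a later retry with this key is treated as a first write.
            del self._entries[key]
            return None
        return body
```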

4.2 Scope

The scope answers the question "within what boundary is this key unique?" The default for B2B APIs is account. The default for multi-user consumer APIs is user. The default for global infrastructure operations is tenant. Document the choice and the reason; agents that orchestrate across tenants need the answer to deduplicate correctly.

4.3 Compensation path for non_idempotent tools

Some operations cannot be retried safely. Examples include a send_email that has already left an outbound queue, a dispatch_drone that is in the air, and a transfer_funds that has settled. For these, docs MUST provide a compensation triplet:

A reversal endpoint (e.g. cancel_send, recall_drone, reverse_transfer).
A documented detection method for "did the original succeed?": a status query, a webhook, or an idempotent GET.
A maximum reversal window.

Without all three, the tool is unsafe for autonomous use and SHOULD be flagged in the manifest as agent_safe: false.
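One way an agent could consume the triplet is sketched below. The dictionary field names, the `recovery_step` policy, and its action strings are hypothetical. The choice to reissue the call when detection reports that nothing was committed is an assumption of this sketch rather than a rule stated by the spec.

```python
from typing import Optional

REQUIRED_FIELDS = ("reversal_endpoint", "detection_method", "max_reversal_window_seconds")


def is_agent_safe(compensation: Optional[dict]) -> bool:
    """True only when all three parts of the triplet are documented."""
    if not compensation:
        return False
    return all(compensation.get(field) for field in REQUIRED_FIELDS)


def recovery_step(original_succeeded: bool, must_undo: bool,
                  elapsed_seconds: float, compensation: dict) -> str:
    """Pick the next action after an ambiguous failure."""
    if not original_succeeded:
        return "reissue"      # detection says nothing was committed
    if not must_undo:
        return "continue"     # the effect exists and the plan wants it
    if elapsed_seconds <= compensation["max_reversal_window_seconds"]:
        return "call:" + compensation["reversal_endpoint"]
    return "escalate"         # too late to reverse automatically
```

The `original_succeeded` flag is expected to come from the documented detection method, never from guessing based on the timeout alone.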

5. OpenAPI extension example

Tools published as OpenAPI specs MUST embed idempotency under a vendor extension. The recommended shape:

paths:
  /v1/orders:
    post:
      operationId: createOrder
      x-agent-idempotency:
        class: key_idempotent
        key_field: "Idempotency-Key"
        key_location: header
        ttl_seconds: 86400
        scope: account
        replay_header: "Idempotency-Replay"
        conflict_status: 409
      parameters:
        - in: header
          name: Idempotency-Key
          schema: { type: string, maxLength: 64 }
          required: true
      responses:
        "201": { description: "First write" }
        "200": { description: "Replay; body identical to original 201" }
        "409": { description: "Key reused with different payload" }

The x-agent-idempotency block is the machine-readable contract. Agent toolchains (for example LangGraph, the Anthropic Agent SDK, or custom MCP servers) can read this block to decide retry behavior automatically.
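A toolchain that wants to honor the extension could derive retry policies from the parsed spec roughly as follows. The sketch assumes the OpenAPI document has already been loaded into a Python dict; `retry_policies` and the policy shape are hypothetical, and no existing framework is claimed to expose this function.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def retry_policies(spec: dict) -> dict:
    """Derive a per-operation retry policy from x-agent-idempotency blocks."""
    policies = {}
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            name = operation.get("operationId", f"{method.upper()} {path}")
            ext = operation.get("x-agent-idempotency")
            if ext is None:
                # No contract: assume the worst case and never retry.
                policies[name] = {"retry": False, "reason": "no contract"}
            elif ext.get("class") in ("read_only", "naturally_idempotent"):
                policies[name] = {"retry": True}
            elif ext.get("class") == "key_idempotent":
                policies[name] = {
                    "retry": True,
                    "header": ext.get("key_field"),
                    "ttl_seconds": ext.get("ttl_seconds"),
                }
            else:
                policies[name] = {"retry": False, "reason": "non_idempotent or unknown"}
    return policies
```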

6. Worked example: agent retry loop with partial failure

sequenceDiagram
    participant Agent
    participant Tool as Order API
    Agent->>Tool: POST /v1/orders (Idempotency-Key K1, payload P)
    Note right of Tool: Server commits order O1
    Tool--xAgent: Network failure (no response)
    Note left of Agent: Timeout — retry with same K1, same P
    Agent->>Tool: POST /v1/orders (Idempotency-Key K1, payload P)
    Tool->>Agent: 200 OK + Idempotency-Replay true + body of O1
    Note left of Agent: Plan continues with O1; no duplicate created

The loop only works because every party honored the contract: the agent kept the same key on retry, the server cached the response, and the response declared itself a replay.

If the agent had retried with a new key, the server would have created O2 — a duplicate order — because deduplication is keyed, not content-based.
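The same sequence can be reproduced against a fake server that commits the first request and then drops its response. `FlakyOrderApi` and `create_order_with_retry` are hypothetical names, and the fake does not check for payload mismatches; the sketch only shows that generating the key once, outside the retry loop, is what prevents the duplicate.

```python
import uuid


class FlakyOrderApi:
    """Fake server: commits the first request but loses its response."""

    def __init__(self):
        self.orders = []
        self._by_key = {}
        self._drop_next = True

    def post_order(self, key, payload):
        if key in self._by_key:
            return 200, {"Idempotency-Replay": "true"}, self._by_key[key]
        order = {"id": f"O{len(self.orders) + 1}", **payload}
        self.orders.append(order)
        self._by_key[key] = order
        if self._drop_next:
            self._drop_next = False
            raise TimeoutError("response lost after commit")
        return 201, {"Idempotency-Replay": "false"}, order


def create_order_with_retry(api, payload, max_attempts=3):
    key = str(uuid.uuid4())  # generated once, reused on every attempt
    for _ in range(max_attempts):
        try:
            return api.post_order(key, payload)
        except TimeoutError:
            continue
    raise RuntimeError("gave up after repeated timeouts")
```

Moving the `uuid.uuid4()` call inside the loop would give each attempt a fresh key, and the fake server would commit a second order on the retry.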

7. Anti-patterns to flag in review

| Anti-pattern | Why it breaks autonomy |
| --- | --- |
| "Retries safe within a few seconds" | Vague TTL; agents that resume across processes will exceed it silently. |
| Idempotency by request-body hash | Different payload formatting (whitespace, key order) yields different hashes; agents using JSON serializers cannot rely on it. |
| 200 returned for both first write and replay with no replay marker | Agents cannot distinguish; downstream logic re-fires side effects. |
| Key required only on writes flagged "important" | Agents cannot tell which writes are important; either require a key on every write or declare the unkeyed ones naturally_idempotent. |
| Implicit per-IP scope | Agent fleets share egress IPs; collisions become inevitable. |
| Documented TTL longer than the storage window | Server starts returning 201 Created on stale keys, silently duplicating. |
| Non-idempotent tool with no compensation endpoint | Tool is fundamentally unsafe for autonomous use. |
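The request-body-hash row is easy to demonstrate. In this sketch the payload values are made up; the three byte strings decode to the same JSON object yet hash differently, which is why raw-body hashing cannot stand in for a client-supplied key.

```python
import hashlib
import json


def body_hash(raw: bytes) -> str:
    """Hash the raw request bytes, as a naive deduplicator would."""
    return hashlib.sha256(raw).hexdigest()


payload = {"sku": "A-1", "qty": 2}

# Three serializations of the same logical payload.
compact = json.dumps(payload, separators=(",", ":")).encode()
spaced = json.dumps(payload).encode()
reordered = json.dumps({"qty": 2, "sku": "A-1"}).encode()

# All three decode to an equal object...
same_object = json.loads(compact) == json.loads(spaced) == json.loads(reordered)
# ...but produce three different hashes.
distinct_hashes = len({body_hash(compact), body_hash(spaced), body_hash(reordered)})
```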

A manifest validator MUST refuse to publish a tool whose declared class is key_idempotent but whose docs omit any of: key_field, ttl_seconds, scope, replay status, conflict status.
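The refusal rule can be sketched as a pure function over one tool's idempotency block. The field names `replay_status` and `conflict_status` are assumed spellings of "replay status" and "conflict status"; the spec names the requirement but does not fix those identifiers for every manifest format.

```python
REQUIRED_KEY_FIELDS = ("key_field", "ttl_seconds", "scope", "replay_status", "conflict_status")


def publish_errors(tool_name: str, idempotency: dict) -> list[str]:
    """Return blocking errors for a key_idempotent tool; an empty list means publishable."""
    if idempotency.get("class") != "key_idempotent":
        return []
    return [
        f"{tool_name}: missing {field}"
        for field in REQUIRED_KEY_FIELDS
        if field not in idempotency
    ]
```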

8. Idempotency vs adjacent concepts

| Concept | What it covers | What it does not |
| --- | --- | --- |
| Idempotency | Same call → same observable outcome (this spec) | Concurrent calls; ordering |
| Atomicity | Operation either fully commits or not at all | Replay safety on retry |
| Exactly-once delivery | Message bus property between producer and consumer | Application-level write semantics |
| Optimistic concurrency | Conflict detection on concurrent edits via version tokens | Deduplication of identical retries |
| Rate limiting | Throttling request volume | Whether retried writes duplicate |

Agents need all of these documented separately. The Rate Limit Documentation Checklist covers throttling; this spec covers idempotency only.

9. Common misconceptions

"GET is always safe so I do not need to document it." True for state, but agents still need to know the response is cacheable, that there is no rate-limit penalty for retries, and whether the response is deterministic.
"My DB has unique constraints; that's idempotency." Unique constraints surface conflicts as errors. Idempotency requires returning the original successful response on replay, not an error.
"The HTTP method tells the agent enough." RFC 9110 defines PUT and DELETE as idempotent at the protocol layer, but application-layer side effects (emails, webhooks, billing) often are not. The class declaration is the source of truth.
"Clients should generate keys however they want." Clients should, but the docs MUST specify accepted formats, max length, and whether the server will reject malformed keys.

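The unique-constraint misconception can be made concrete with SQLite from the Python standard library. The table layout and the `create_order` function are hypothetical. The constraint alone only raises an error on the second insert; returning the original response on replay is the extra step that makes the endpoint idempotent. The sketch does not compare payloads, so it omits the 409 mismatch case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (idem_key TEXT PRIMARY KEY, body TEXT NOT NULL)")


def create_order(key: str, body: str):
    """Insert once; on a duplicate key, return the original body instead of an error."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO orders (idem_key, body) VALUES (?, ?)", (key, body))
        return 201, body
    except sqlite3.IntegrityError:
        # The constraint fired: look up and replay the first response.
        row = conn.execute("SELECT body FROM orders WHERE idem_key = ?", (key,)).fetchone()
        return 200, row[0]
```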
10. How to apply this spec

Inventory every tool the agent can call. Tag each with one class from the taxonomy.
For every key_idempotent tool, fill the field block (key name, location, TTL, scope, replay markers).
For every non_idempotent tool, write the compensation triplet (reversal endpoint, detection method, reversal window). If you cannot, mark agent_safe: false.
Embed the x-agent-idempotency extension in OpenAPI specs and the equivalent block in MCP server manifests and Anthropic Agent Skills.
Run a machine-checkable assertion suite before publish:

assertions:
  - every_write_endpoint_has_class: true
  - key_idempotent_endpoints_specify_ttl: true
  - key_idempotent_endpoints_specify_scope: true
  - replay_response_status_documented: true
  - replay_response_body_byte_identical_to_first_write: true
  - non_idempotent_endpoints_have_compensation: true
  - manifest_validates_against_x_agent_idempotency_schema: true

Surface the contract in the human-readable docs page and in the manifest. Agents read both; reviewers read mostly the page; both must agree.
Test the replay path in CI. A test harness that issues the same call twice and asserts the replay header is the cheapest insurance against silent duplicates.
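Part of the assertion suite can be evaluated directly over a list of endpoint manifests. This sketch covers only four of the seven assertions, treats every non-GET method as a write, and uses hypothetical manifest keys (`method`, `class`, `ttl_seconds`, `scope`, `compensation`); the remaining assertions need live responses or a schema and are left out.

```python
def run_assertions(endpoints: list[dict]) -> dict[str, bool]:
    """Evaluate the manifest-only assertions and return a name -> pass/fail map."""
    writes = [e for e in endpoints if e.get("method", "").upper() != "GET"]
    keyed = [e for e in endpoints if e.get("class") == "key_idempotent"]
    unsafe = [e for e in endpoints if e.get("class") == "non_idempotent"]
    return {
        "every_write_endpoint_has_class": all("class" in e for e in writes),
        "key_idempotent_endpoints_specify_ttl": all("ttl_seconds" in e for e in keyed),
        "key_idempotent_endpoints_specify_scope": all("scope" in e for e in keyed),
        "non_idempotent_endpoints_have_compensation": all(
            bool(e.get("compensation")) for e in unsafe
        ),
    }
```

A publish step could block whenever any value in the returned map is False.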

11. Related references

Agent Tool Use Documentation Spec: the parent doc for tool manifests.
Agent Error Handling Documentation Spec: error-shape contracts that idempotency depends on.
Agent Rate Limit Documentation Checklist: throttle contracts; complements retry safety.
Agent Skill Manifest Specification: where the idempotency block lives in Anthropic Skills.

Primary sources for this spec:

Stripe API Reference, Idempotent requests — https://docs.stripe.com/api/idempotent_requests
RFC 9110, HTTP Semantics §9.2.2 Idempotent methods — https://www.rfc-editor.org/rfc/rfc9110.html#name-idempotent-methods
OpenAI Platform, Function calling — https://platform.openai.com/docs/guides/function-calling
Anthropic, Tool use with Claude — https://docs.anthropic.com/en/docs/build-with-claude/tool-use
Model Context Protocol specification — https://modelcontextprotocol.io

FAQ

Q: Does idempotency belong in the tool manifest or only in human-readable docs?

A: Both. Frontier orchestrators read the manifest to make automatic retry decisions; reviewers read the docs page. The manifest is the source of truth for machines; the docs page must mirror it for humans. Drift between the two is the most common failure mode and is what the machine-checkable assertions in this spec are designed to catch.

Q: What TTL should I default to if I cannot pick one?

A: 24 hours is the safest default for transactional writes and matches the most-copied prior art (Stripe). Increase to 7 days only if your agent workflows persist plans across long-running processes; reduce to 5 minutes only if the operation is high-throughput and the deduplication window is intrinsically narrow.

Q: How does this differ from "exactly-once semantics" in message buses?

A: Exactly-once delivery is a transport property between a producer and a consumer of messages. Idempotency in this spec is an application-layer property of a tool's write endpoint. They are complementary: a message bus can deliver a tool invocation exactly once at the transport layer, but if the tool itself is not idempotent, downstream replay (process restart, plan resume) can still duplicate. The application-layer contract is the one agents must rely on.

Q: Should GET endpoints declare idempotency at all?

A: Yes — declare read_only. The class is the agent's signal that retries are unconditionally safe. Omitting it forces the agent's planner to assume the worst case and either retry conservatively or not at all, which leaves availability on the table.

Q: What is the minimum viable change for a team that has no idempotency docs today?

A: Add the four-class taxonomy field to every endpoint manifest. That alone unlocks safe retry decisions for read_only and naturally_idempotent tools and explicitly flags the rest as needing follow-up. Then add the key field block for the highest-risk write endpoints (payments, orders, communications) before tackling the long tail.

Q: How do I test that my replay response is byte-identical to the original?

A: Snapshot the first response body at write time, store it under the idempotency key, and serve the stored bytes verbatim on replay. Only the status (200 instead of 201) and the Idempotency-Replay header differ from the first response. In CI, fire the same request twice with the same key and assert byte equality on the two response bodies and Idempotency-Replay: true on the second response.
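A minimal sketch of that snapshot-and-compare test follows. The `respond` function and the `build_resource` callback are hypothetical stand-ins for a real endpoint. The resource deliberately includes a field that would change if it were rebuilt, so the byte comparison fails unless the stored bytes are replayed.

```python
import json

_snapshots = {}  # idempotency key -> exact bytes sent on the first write


def respond(key, build_resource):
    """Serve the stored bytes on replay instead of rebuilding the resource."""
    if key in _snapshots:
        return 200, {"Idempotency-Replay": "true"}, _snapshots[key]
    raw = json.dumps(build_resource()).encode("utf-8")
    _snapshots[key] = raw
    return 201, {"Idempotency-Replay": "false"}, raw


def check_replay_is_byte_identical():
    calls = []

    def build():
        calls.append(1)
        # build_count would differ if the resource were rebuilt on replay.
        return {"id": "O1", "build_count": len(calls)}

    first = respond("K1", build)
    second = respond("K1", build)
    assert first[0] == 201 and second[0] == 200
    assert second[1]["Idempotency-Replay"] == "true"
    assert first[2] == second[2]   # byte equality on the bodies
    assert len(calls) == 1         # the resource was built only once
    return True
```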
