Agent Replay Attack Prevention Spec

Agent replay attack prevention combines per-request nonces, idempotency keys with bounded TTL, request signing, and replay windows so no agent prompt or tool-call payload can be re-executed by an attacker after first delivery.

TL;DR

Per-request nonces let the receiver detect replayed prompts before any handler runs.
Idempotency keys with bounded TTL collapse retries safely without re-executing side effects.
Request signing (HMAC over body + timestamp + nonce) authenticates and binds payload to time.
Replay windows (default 5 minutes) plus tool-call dedupe defend against tool-payload replay across the agent loop.

Definition

A replay attack on an agent runtime is any reuse of a previously valid request — a user prompt, a tool-call payload, a streaming chunk, or an internal control message — to coerce the agent into re-executing an action. Replay is distinct from forgery: the attacker does not need to mint a new payload, only to capture and resend a real one. Agent replay attack prevention is the runtime discipline that makes every accepted message uniquely identifiable, time-bound, and authenticated, so any second delivery is rejected before it reaches a handler.

The defense rests on four primitives: a nonce (unique per request), an idempotency key (stable per logical operation, with TTL), a request signature (HMAC over body, timestamp, and nonce), and a replay window (the wall-clock interval inside which a signature is accepted). For agents that issue tool calls, the same primitives must wrap the tool payload, not only the inbound user prompt — otherwise an attacker who captures a tool call can replay a successful side-effecting action like a payment, a write, or a privileged read.

This specification defines the contract: which fields are mandatory, where they live in the request envelope, what TTLs are acceptable, and how the receiver should respond when a duplicate is detected.

Why this matters

Agent runtimes amplify the blast radius of replay attacks. A traditional API replay re-runs one endpoint; an agent replay can re-trigger an entire reasoning chain that issues many tool calls, each of which may have side effects. A captured agent prompt can be replayed to drain a budgeted balance, re-send notifications, or repeat an irreversible workflow such as filing a ticket, sending an email, or transferring funds.

Because most agents are exposed over long-lived channels (WebSockets, SSE, or browser-side fetch with shared session cookies) and because tool-call payloads are typically composed by the model and forwarded to a downstream service, the attack surface is unusually wide. An attacker who captures a single payload from a proxy, browser memory, or a logging system can replay it without re-authenticating, often inside a session that the receiver still considers live.

Replay defense also matters for correctness, not only security. Agents retry calls under flaky network conditions, and a network-induced retry is functionally identical to a malicious replay. Without idempotency keys and a replay window, a benign retry can double-charge a customer, post a duplicate row, or fan out a notification twice. The same primitives that defeat attackers also keep retries safe. The retry-safety half of that dual role is the pattern documented in Stripe's idempotent requests contract and widely reproduced by other payments and write APIs.

How it works

The receiver verifies four things per request, in order: signature, timestamp, nonce, and idempotency key. If any check fails, the handler is never invoked.

Primitive	Field	Receiver storage	TTL	Rejection reason
Signature	X-Agent-Signature: HMAC-SHA256(secret, ts + "." + nonce + "." + body)	none	n/a	signature_mismatch
Timestamp	X-Agent-Timestamp: unix-seconds	none	replay window (default 300s)	timestamp_outside_window
Nonce	X-Agent-Nonce: 128-bit random	seen-nonce cache (Redis/Memcached)	replay window	nonce_replayed
Idempotency key	Idempotency-Key: opaque-string	response cache	per-operation TTL (default 24h)	idempotency_replay (returns cached response)

Order matters. The signature check is first because it is cheap and gates everything else; without a valid signature, no further state is consulted. Timestamp comes next: any request older than the replay window is rejected without a cache lookup. The nonce cache is consulted only for in-window, validly signed requests, which keeps the cache size bounded by requests_per_window. Finally, the idempotency key is checked for operations that should be safely retryable; on a hit, the receiver returns the original response rather than rejecting, which is what makes legitimate retries safe.
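On the sending side, the three signed headers from the table could be produced roughly as follows. This is a minimal sketch, not a reference implementation: the function name sign_request, the hex encoding of the digest, and the byte-level concatenation of timestamp, nonce, and body are assumptions that the sender and receiver would have to agree on.

```python
import hashlib
import hmac
import secrets
import time
from typing import Dict, Optional


def sign_request(secret: bytes, body: bytes, now: Optional[float] = None) -> Dict[str, str]:
    """Build the timestamp, nonce, and signature headers for one request."""
    # Unix seconds; `now` is injectable only to make the function testable.
    ts = str(int(time.time() if now is None else now))
    # 16 random bytes = 128 bits, rendered as 32 hex characters.
    nonce = secrets.token_hex(16)
    # Signed message is ts + "." + nonce + "." + body, as in the table.
    message = f"{ts}.{nonce}.".encode() + body
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {
        "X-Agent-Timestamp": ts,
        "X-Agent-Nonce": nonce,
        "X-Agent-Signature": signature,
    }
```

Because the nonce is drawn fresh on every call, signing the same body twice yields two different signatures, which is what lets the receiver tell a new request from a captured one.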

For tool calls inside the agent loop, the agent runtime is the signer and the tool service is the receiver. The agent must include a fresh nonce and timestamp on every tool invocation and must persist its own seen-nonce set if it accepts tool callbacks. Streaming responses use a per-stream session id plus a monotonically increasing chunk sequence; out-of-order or repeated chunks within a session are rejected. The reconnect contract for SSE is formalized by the Last-Event-ID header in the WHATWG HTML Living Standard, which lets a client resume from the last seen sequence without re-accepting earlier chunks.
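The per-stream sequence rule can be sketched as a small in-memory guard. The class name StreamSequenceGuard is hypothetical, and the sketch assumes sequences start at 1 and advance by exactly one, so gaps are rejected along with duplicates; a runtime that tolerates gaps would relax that comparison.

```python
from typing import Dict


class StreamSequenceGuard:
    """Tracks the last accepted chunk sequence number per stream session."""

    def __init__(self) -> None:
        self._last_seq: Dict[str, int] = {}

    def accept(self, session_id: str, seq: int) -> bool:
        """Return True only if seq is exactly one past the last accepted chunk."""
        expected = self._last_seq.get(session_id, 0) + 1
        if seq != expected:
            return False  # duplicate, replayed, or out-of-order chunk
        self._last_seq[session_id] = seq
        return True

    def resume_from(self, session_id: str) -> int:
        """Sequence number the sender should continue with after a reconnect."""
        return self._last_seq.get(session_id, 0) + 1
```

On reconnect, the value returned by resume_from plays the role that Last-Event-ID plays for SSE: it tells the sender where to continue without re-accepting earlier chunks.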

Practical application

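The minimum viable wiring for a Python agent service runs four checks in the request pipeline. The signature, timestamp, and nonce checks fit in a single verification function; Reject and seen_nonce_cache stand in for the service's own error type and cache client.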

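import hashlib
import hmac
import time

# Reject and seen_nonce_cache (a redis-py style client) are defined elsewhere.
def verify_request(req, secret, replay_window=300, max_future_skew=30):
    ts = int(req.headers["X-Agent-Timestamp"])
    nonce = req.headers["X-Agent-Nonce"]
    sig = req.headers["X-Agent-Signature"]
    # 1. Signature first: HMAC-SHA256 over "ts.nonce.body" (secret and body as bytes, hex digest assumed).
    message = f"{ts}.{nonce}.".encode() + req.body
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise Reject("signature_mismatch")
    # 2. Timestamp: reject stale requests and anything too far in the future.
    age = time.time() - ts
    if age > replay_window or age < -max_future_skew:
        raise Reject("timestamp_outside_window")
    # 3. Nonce: atomic set-if-absent with TTL, so two concurrent replays cannot both pass.
    if not seen_nonce_cache.set(nonce, "1", nx=True, ex=replay_window + max_future_skew):
        raise Reject("nonce_replayed")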

For idempotency, wrap the actual handler so that the first run records the response and any subsequent run with the same key returns the recorded response without re-executing side effects. Stripe's idempotency contract is the canonical reference: the recorded response is returned verbatim on a duplicate, keys can be pruned once they are at least 24 hours old, and incoming parameters are compared with those of the original request so that a same-key-different-body retry is rejected with an error (Stripe API: Idempotent Requests).
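A minimal sketch of such a wrapper, assuming an in-memory store and a SHA-256 hash of the body as the way to detect a same-key-different-body retry. The names IdempotencyStore and IdempotencyConflict are hypothetical, and the sketch is not safe under concurrency: two simultaneous first deliveries could both execute unless a shared store with an atomic claim step replaces the dictionary.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


class IdempotencyStore:
    """In-memory response cache keyed by idempotency key (illustrative only)."""

    def __init__(self, ttl_seconds: int = 24 * 3600) -> None:
        self.ttl = ttl_seconds
        # key -> (expiry time, body hash, recorded response)
        self._entries: Dict[str, Tuple[float, str, object]] = {}

    def run(self, key: str, body: bytes, handler: Callable[[bytes], object]) -> object:
        now = time.time()
        body_hash = hashlib.sha256(body).hexdigest()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            _, stored_hash, response = entry
            if stored_hash != body_hash:
                raise IdempotencyConflict(key)
            return response  # duplicate: return the recording, run nothing
        response = handler(body)  # first delivery or expired key: execute once
        self._entries[key] = (now + self.ttl, body_hash, response)
        return response
```

A handler that charges a card would then be invoked as store.run(key, body, charge); a retry carrying the same key and body gets the first response back without a second charge.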

For tool calls, wrap the agent's tool dispatcher so that every outbound call carries the four headers and so that tool service responses are verified the same way on the way back. For OpenAI-style function calling, generate the nonce and timestamp at the point the agent commits to the tool call, not at planning time, so the replay window starts as late as possible. For Anthropic Messages API tool use, the same principle applies; both vendors attach an identifier to each tool call that you can incorporate into the idempotency key (OpenAI API Reference, Anthropic Messages API), but those vendor IDs are not a substitute for an end-to-end signature you control.

Common mistakes

Verifying the signature but not the timestamp: an attacker can replay a validly signed request indefinitely.
Caching nonces without TTL: the cache grows without bound, and once memory pressure forces eviction it can drop nonces that are still inside the replay window, which reopens the replay it was meant to block.
Using the same secret across environments: a captured staging payload becomes a production replay.
Signing only the headers and not the body: the attacker can swap the payload for the same envelope.
Treating idempotency keys as nonces: keys are scoped to a logical operation and intentionally outlive a single request, while nonces are single-use.
Skipping replay defense on tool callbacks: many agents protect inbound user prompts but leave the tool side wide open, which is typically the higher-value target.
Returning a generic 400 for all replay rejections: the client cannot distinguish a stale clock from a malicious replay, which makes diagnostics painful.

FAQ

Q: How is a nonce different from an idempotency key?

A nonce is a single-use, opaque value that the receiver remembers for the length of the replay window; its job is to detect any second delivery of the same request. An idempotency key is a stable identifier for a logical operation — the same key is intentionally reused on retry so the receiver can return the original response. Nonces defend against replay; idempotency keys make retries safe. Production systems use both because they answer different questions.
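The difference shows up most clearly across two delivery attempts of the same tool call. In this sketch, run_id and tool_call_id are hypothetical identifiers for the agent run and the logical tool call, and deriving the key by hashing them together is one possible scheme, not a requirement of the spec.

```python
import hashlib
import secrets
from typing import Dict


def idempotency_key(run_id: str, tool_call_id: str) -> str:
    """Stable per logical operation: identical on every retry of the same call."""
    return hashlib.sha256(f"{run_id}:{tool_call_id}".encode()).hexdigest()


def fresh_nonce() -> str:
    """Single use: a new 128-bit random value for every delivery attempt."""
    return secrets.token_hex(16)


def attempt_headers(run_id: str, tool_call_id: str) -> Dict[str, str]:
    """Headers for one delivery attempt (timestamp and signature omitted)."""
    return {
        "Idempotency-Key": idempotency_key(run_id, tool_call_id),
        "X-Agent-Nonce": fresh_nonce(),
    }
```

Calling attempt_headers twice for the same run and tool call yields the same Idempotency-Key and two different nonces, so the receiver accepts the retry as a new delivery yet returns the recorded response.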

Q: What replay window TTL should I use?

Default to 300 seconds (5 minutes) for end-user requests and 60 seconds for service-to-service tool calls. Shorter windows shrink the attacker's exploit budget but require tighter clock sync; longer windows tolerate clock drift but enlarge the seen-nonce cache. If clients run on devices whose clocks drift, such as mobile handsets, 600 seconds is a reasonable upper bound, but never accept timestamps from the future beyond a small skew (30 seconds is typical).

Q: Should I sign the full tool-call payload or only the agent prompt?

Sign both, independently. The agent prompt envelope protects the inbound side; the tool-call envelope protects the outbound side. Signing only the prompt leaves the tool service trusting whatever the agent forwards, which is exactly the surface an attacker who has compromised the model output channel will target. Each envelope should carry its own nonce and timestamp so the two surfaces can be audited and rate-limited independently.

Q: How do I handle replay defense for streaming / SSE responses?

Treat the stream as a single signed session establishment plus per-chunk sequence numbers. The session id and HMAC are validated on the initial handshake; each chunk carries a monotonically increasing sequence number and the receiver rejects any out-of-order or duplicate sequence within the session. On reconnect, the client supplies the last seen sequence and the server resumes from the next one — the pattern formalized by SSE's Last-Event-ID header in the WHATWG HTML Living Standard.

Q: What happens when the nonce or idempotency-ID space is exhausted?

A 128-bit random nonce gives roughly 3.4×10³⁸ values, so collision is statistically negligible inside any realistic replay window. For idempotency keys, choose a 128-bit UUID or larger; if the client generates keys deterministically from request content, hash the content with SHA-256 to get a 256-bit key. Cache size, not ID space, is the real constraint — size your seen-nonce cache for peak_RPS × replay_window plus headroom.
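The sizing rule can be worked through with concrete numbers. The 50% headroom and the 100 bytes per cache entry below are illustrative assumptions, not measured values; actual per-entry overhead depends on the cache and key format.

```python
def nonce_cache_entries(peak_rps: int, replay_window_s: int, headroom: float = 0.5) -> int:
    """Entry count: one nonce per in-window request, plus fractional headroom."""
    return int(peak_rps * replay_window_s * (1 + headroom))


def nonce_cache_bytes(entries: int, bytes_per_entry: int = 100) -> int:
    """Rough memory estimate; bytes_per_entry is an assumed figure."""
    return entries * bytes_per_entry


# 2,000 requests/second over a 300-second window with 50% headroom.
entries = nonce_cache_entries(peak_rps=2000, replay_window_s=300)
print(entries)                     # 900000
print(nonce_cache_bytes(entries))  # 90000000, about 90 MB under the assumption above
```

Halving the replay window halves the cache, which is one reason the shorter window suits high-volume service-to-service traffic.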

Q: Should a duplicate request return the original response or an error?

It depends on which primitive triggered the rejection. A nonce or signature replay should return an error (HTTP 409 or 401) because a duplicate at that layer is, by construction, malicious or a client bug. An idempotency-key match should return the original cached response because that is the entire point of the key — to make legitimate retries indistinguishable from the original. Surface a distinct error code per rejection class so client diagnostics are unambiguous.
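One way to surface a distinct code per rejection class is a small lookup from rejection reason to HTTP status. The specific assignments here (401 for signature and timestamp failures, 409 for a replayed nonce) are an assumption within the "409 or 401" range given above, and rejection_response is a hypothetical helper name.

```python
from http import HTTPStatus
from typing import Dict, Tuple

# Assumed mapping; an idempotency-key hit is absent on purpose, because it
# returns the cached response rather than an error.
REJECTION_STATUS = {
    "signature_mismatch": HTTPStatus.UNAUTHORIZED,        # 401
    "timestamp_outside_window": HTTPStatus.UNAUTHORIZED,  # 401
    "nonce_replayed": HTTPStatus.CONFLICT,                # 409
}


def rejection_response(reason: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, JSON-serializable body) for a rejection reason."""
    status = REJECTION_STATUS[reason]
    return int(status), {"error": reason}
```

Keeping the reason string in the body lets a client distinguish a stale clock (timestamp_outside_window) from a genuine replay (nonce_replayed) even when both map to the same status family.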