Agent-Ready Documentation Checklist: Pre-Publish Audit for Autonomous AI Agents
Before publishing docs that autonomous AI agents will read, walk this 35-point pre-publish checklist across six areas: structure, retrieval, tool-use signals, error states, governance, and verification. Failing two or more items in any single category is a release blocker.
TL;DR
Autonomous agents (Claude, ChatGPT workspace agents, Devin, Cursor, Kiro) read your docs the way a junior engineer with no Slack access does: only what is on the page. If your docs lack stable anchors, machine-readable tool descriptions, error states, and an explicit retrieval path, agents will either guess wrong or call your support team. This 35-point checklist is the release gate. Score each item pass/fail; ship only after all six categories clear the threshold.
How to use this checklist
Run this checklist on every doc page or doc-set release. Mark each item pass (✅) or fail (❌). For each category there is a release-gate threshold. If two or more items fail in any single category, do not publish until they are fixed. Tooling like Mintlify's LLM-optimization layer can automate parts of this, but the human checklist below is the auditable record.
For narrative context on agent-friendly docs, read our agent-friendly documentation guide. For the specification this checklist enforces, see AI agents content specification.
Category 1 — Structure and parseability (release gate: ≥5 of 6 pass)
- Single H1 per page matching the page title verbatim. Agents anchor on the H1 to confirm context.
- Strict heading hierarchy (H1 → H2 → H3, no skips). LLMs lose document structure when levels skip.
- Stable, kebab-case anchor IDs on every H2 and H3. Anchors must not change across versions; agents cite by anchor (a source sketch follows this list).
- Plain Markdown source available at a stable URL (/page.md or Accept: text/markdown). HTML-only docs lose roughly 30% of token budget to layout.
- No critical content in screenshots or images. Anything an agent must read should also be in text. Give diagrams alt text that supplements the prose.
- Code blocks tagged with language identifiers. Untagged blocks lose syntax context; agents handle bash, json, ts correctly only when fenced and labeled.
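A minimal sketch of the first three items, assuming a generator that supports explicit heading IDs (the {#...} suffix is kramdown/Pandoc-style syntax; yours may differ):

```markdown
# Create a payment intent

## Request parameters {#request-parameters}

### Amount and currency {#amount-and-currency}

## Error codes {#error-codes}
```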
Category 2 — Retrieval and discovery (release gate: ≥5 of 6 pass)
- llms.txt at the doc-site root lists canonical pages, API references, and tutorials. Validate against our llms.txt spec; a minimal example follows this list.
- llms-full.txt (optional but recommended) concatenates the doc set into one Markdown file for low-latency context loading.
- MCP retrieval endpoint exposed when docs back a product an agent will operate (Anthropic, Cursor, Mintlify, and others ship MCP retrieval). MCP gives agents a live tool, not a static dump.
- Sitemap.xml + RSS feed with lastmod reflecting actual content changes — not deploy timestamps. Agents downgrade pages that look like noise.
- robots.txt allows GPTBot, ClaudeBot, PerplexityBot, Googlebot, CCBot unless content is licensed. Do not silently block AI crawlers.
- Canonical URL on every page (a <link rel="canonical"> tag in the head and canonical_url in any structured metadata). Prevents duplicate-page penalties.
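A minimal llms.txt in the shape the llmstxt.org proposal describes; every URL and description below is a placeholder:

```markdown
# Example Payments API

> Developer documentation for the Example Payments API: quickstart, API reference, and tutorials.

## API reference

- [Create a payment intent](https://docs.example.com/api/payment-intents.md): Create and confirm payments
- [Errors](https://docs.example.com/api/errors.md): Error codes and recovery guidance

## Tutorials

- [Quickstart](https://docs.example.com/quickstart.md): First API call in five minutes
```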
Category 3 — Tool-use signals (release gate: ≥7 of 8 pass)
- Every public API endpoint has an OpenAPI 3.x or JSON Schema definition linked from the page. Function-calling agents consume the spec directly (a sketch follows this list).
- Tool / function descriptions are imperative, complete sentences ("Returns the current account balance for the authenticated user"). Vague labels ("Account") fail.
- Parameters list type, required, default, enum, and example for every field. See our function calling documentation spec.
- Examples are runnable, not pseudocode. Curl + at least one SDK example per endpoint, copy-paste-ready.
- Idempotency is explicit. Mark every endpoint with idempotent: true|false so agents know whether they can safely retry.
- Side-effects are spelled out. Any endpoint that writes, deletes, charges, or sends notifications says so in the first sentence.
- Authentication scopes named. State exactly which scope or token is required for each endpoint.
- Rate limits documented per tier with header names (X-RateLimit-Limit, X-RateLimit-Remaining) and 429 retry guidance.
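A sketch of these signals in OpenAPI 3.x for a hypothetical endpoint. Note that x-idempotent is an invented vendor extension for illustration, not an OpenAPI built-in:

```yaml
paths:
  /v1/accounts/{account_id}/balance:
    get:
      operationId: getAccountBalance
      summary: Returns the current account balance for the authenticated user.
      description: >-
        Read-only, no side effects. Requires the balances:read scope.
        Rate limits are reported in X-RateLimit-Limit and X-RateLimit-Remaining.
      x-idempotent: true   # invented extension marking the call safe to retry
      parameters:
        - name: account_id
          in: path
          required: true
          schema:
            type: string
          example: acct_1a2b3c
        - name: currency
          in: query
          required: false
          schema:
            type: string
            enum: [USD, EUR, GBP]
            default: USD
      responses:
        "200":
          description: Current balance in minor units for the requested currency.
        "429":
          description: Rate limited. Honor Retry-After before retrying.
```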
Category 4 — Error states and self-repair (release gate: ≥5 of 6 pass)
- Every documented endpoint lists 4xx and 5xx error codes with name, HTTP status, condition, and recommended next action. See agent error handling documentation spec; an example payload follows this list.
- Validation errors include field names and machine-readable codes. "Invalid input" is not enough; agents need the field path and error code.
- Retryable vs terminal errors marked. Agents must know which 5xx responses warrant retries vs escalations.
- Hint phrasing for recovery. When the fix is non-obvious, do not stop at "contact support"; name a specific docs link or a runnable diagnostic.
- Common pitfalls section lists the top three mistakes agents make calling this API, with the right pattern.
- Deprecation notices include sunset date and replacement endpoint. Agents should never call a deprecated endpoint without a documented next step.
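For illustration, a hypothetical error payload an agent can act on without guessing; the field names are ours, not a standard:

```json
{
  "error": {
    "code": "invalid_currency",
    "status": 422,
    "field": "payment.currency",
    "message": "currency must be one of the ISO 4217 codes enabled on this account",
    "retryable": false,
    "doc_url": "https://docs.example.com/api/errors#invalid-currency"
  }
}
```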
Category 5 — Governance and trust (release gate: ≥4 of 5 pass)
- Page-level last_reviewed_at and version visible in the page metadata, not just commit history. Agents weight freshness (a front-matter sketch follows this list).
- Authorship trail — a named author or team with credentials/sameAs links. Salesforce and OWASP both list provenance as a top governance signal for agentic apps.
- License terms for content reuse (CC-BY, MIT, proprietary). Agents may refuse to cite content with unclear licenses.
- Do-not-train flags (e.g., noai, noimageai) used only intentionally. Default-on training blocks remove you from the citation pool.
- Change log linked from the page or doc set. A stable change-log URL helps agents reconcile prior call patterns with new behavior.
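One way to surface these signals is page front matter. The field names below are illustrative, not a schema your tooling will recognize out of the box:

```yaml
---
title: Create a payment intent
version: 2025-11
last_reviewed_at: 2025-11-04
author: Payments Docs Team
sameAs: https://docs.example.com/teams/payments-docs
license: CC-BY-4.0
changelog: https://docs.example.com/changelog
---
```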
Category 6 — Verification gates (release gate: ≥3 of 4 pass)
- Schema validators pass — Schema Markup Validator, Google Rich Results Test, OpenAPI lint (Spectral or Redocly).
- Anchor diff vs prior version is reviewed. No silently renamed anchors; renames should be tracked as redirects (a sketch follows this list).
- Agent dry-run executed. A scripted Claude / ChatGPT / Cursor agent attempts the top three workflows on the page; failures block release.
- A 24-hour no-regress monitor watches for new 4xx/5xx spikes from agent traffic post-publish; rollback plan exists.
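A minimal anchor-diff gate as a shell sketch; the dist/ layout, the id= pattern, and the anchors.prev file are all assumptions about your build pipeline:

```bash
# Extract anchor IDs from the built site; assumes the prior release's
# sorted list was saved as anchors.prev.
grep -rhoE 'id="[a-z0-9][a-z0-9-]*"' dist/ | sort -u > anchors.new

# Lines only in anchors.prev are anchors that existed before and are gone now.
comm -23 anchors.prev anchors.new > removed-anchors.txt

# A non-empty file means silently removed or renamed anchors: block the release.
test ! -s removed-anchors.txt
```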
Quick scoring template
| Category | Items | Pass threshold | Result |
|---|---|---|---|
| Structure & parseability | Q1-6 | ≥5 / 6 | |
| Retrieval & discovery | Q7-12 | ≥5 / 6 | |
| Tool-use signals | Q13-20 | ≥7 / 8 | |
| Error states & self-repair | Q21-26 | ≥5 / 6 | |
| Governance & trust | Q27-31 | ≥4 / 5 | |
| Verification gates | Q32-35 | ≥3 / 4 | |
All six categories must clear their threshold to release.
Hard fails (do not ship under any circumstance)
- Q11 fails (you are silently blocking AI crawlers but advertising agent-readiness).
- Q13 fails on a public API page (no OpenAPI / JSON Schema = unusable for function calling).
- Q21 fails on any endpoint that can write or charge.
- Q33 fails (silent anchor renames break every existing agent integration).
What to do if you fail
- Failures in Structure or Retrieval — fix mechanically. These are nearly always template / pipeline issues.
- Failures in Tool-use signals — push back to API owners. The fix is in the OpenAPI source, not the docs render.
- Failures in Error states — schedule a paired session with the engineer who owns the endpoint; this category has the highest hidden-debt count in audits.
- Failures in Governance — escalate to a docs lead; license and authorship policy is a doc-set-wide decision, not page-by-page.
FAQ
Q: How is this different from standard API doc QA?
Standard API doc QA optimizes for human developers reading sequentially. Agent-readiness adds three new failure modes: anchor instability, missing machine-readable error codes, and lack of MCP / llms.txt retrieval. A page can pass classic doc QA and still be invisible to autonomous agents.
Q: Do I need both llms.txt and an MCP server?
For static doc sets, llms.txt is enough. If your docs back a product where agents take live actions (search, call tools, fetch latest data), add an MCP retrieval endpoint. They serve different layers: llms.txt is a sitemap for LLMs, MCP is a live tool surface.
Q: Should I block AI crawlers if my content is licensed?
Block only the specific bots you have a contractual reason to block (training vs answer-time use can differ; Anthropic, for example, documents ClaudeBot for training crawls and Claude-User for user-initiated fetches). Default-block reduces your visibility without legal benefit.
Q: How often should I rerun this checklist?
Run the structure, tool-use, and error-state items on every page release. Run governance and verification on every doc-set release (typically quarterly). Run a full audit when an upstream model adds new agent capabilities (for example, ChatGPT workspace agents shipping in April 2026).
Q: Can I automate this checklist?
Q1-9, Q11-12, Q15-16, Q20, Q22-23, Q27, Q32, and Q34 are automatable in CI today (a wiring sketch follows). The rest still need a human pass for judgment, especially the prose quality of error hints and tool descriptions.
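A sketch of the CI wiring, with tool choices, paths, and script names as assumptions rather than a prescribed stack:

```yaml
name: docs-agent-readiness
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @stoplight/spectral-cli lint openapi.yaml         # Q13, Q32
      - run: npx markdownlint-cli2 "docs/**/*.md"                  # Q1, Q2, Q6
      - run: ./scripts/anchor-diff.sh                              # Q3, Q33 (hypothetical script; see Category 6)
      - run: test -f public/llms.txt && test -f public/robots.txt  # Q7, Q11
```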
Related Articles
Agent Authentication Documentation Spec
Document authentication for autonomous agents: OAuth flows, API keys, scopes, error states, and consent UX patterns AI agents need to operate safely.
Agent Tool Manifest QA Checklist: Validating SKILL.md and Tool Schemas Before Agent Discovery
QA checklist for agent tool manifests: validate SKILL.md, JSON schemas, descriptions, and examples so agents discover and call tools correctly.
llms.txt generator: requirements, output format, and validation checklist
A reference spec for llms.txt generators: inputs, output format, URL rules, and the validation checklist for a machine-readable file.