Agent-Ready Documentation Checklist: Pre-Publish Audit for Autonomous AI Agents
Before publishing docs that autonomous AI agents will read, walk this 35-point pre-publish checklist across six areas: structure, retrieval, tool-use signals, error states, governance, and verification. Failing two or more items in any single category is a release blocker.
TL;DR
Autonomous agents (Claude, ChatGPT workspace agents, Devin, Cursor, Kiro) read your docs the way a junior engineer with no Slack access does: only what is on the page. If your docs lack stable anchors, machine-readable tool descriptions, error states, and an explicit retrieval path, agents will either guess wrong or call your support team. This 35-point checklist is the release gate. Score each item pass/fail; ship only after all six categories clear the threshold.
How to use this checklist
Run this checklist on every doc page or doc-set release. Mark each item pass (✅) or fail (❌). For each category there is a release-gate threshold. If two or more items fail in any single category, do not publish until they are fixed. Tooling like Mintlify's LLM-optimization layer can automate parts of this, but the human checklist below is the auditable record.
For narrative context on agent-friendly docs, read our agent-friendly documentation guide. For the specification this checklist enforces, see AI agents content specification.
Category 1 — Structure and parseability (release gate: ≥5 of 6 pass)
- Single H1 per page matching the page title verbatim. Agents anchor on the H1 to confirm context.
- Strict heading hierarchy (H1 → H2 → H3, no skips). LLMs lose document structure when levels skip.
- Stable, kebab-case anchor IDs on every H2 and H3. Anchors must not change across versions; agents cite by anchor (a source sketch follows this list).
- Plain Markdown source available at a stable URL (/page.md or Accept: text/markdown). HTML-only docs lose roughly 30% of token budget to layout.
- No critical content in screenshots or images. Anything an agent must read should also be in text. Give diagrams alt text that supplements the prose.
- Code blocks tagged with language identifiers. Untagged blocks lose syntax context; agents handle bash, json, ts correctly only when fenced and labeled.
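A minimal sketch of the first three items, assuming a generator that supports explicit heading IDs (the {#...} suffix is kramdown/Pandoc-style syntax; yours may differ):

```markdown
# Create a payment intent

## Request parameters {#request-parameters}

### Amount and currency {#amount-and-currency}

## Error codes {#error-codes}
```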
Category 2 — Retrieval and discovery (release gate: ≥5 of 6 pass)
- llms.txt at the doc-site root lists canonical pages, API references, and tutorials. Validate against our llms.txt spec; a minimal example follows this list.
- llms-full.txt (optional but recommended) concatenates the doc set into one Markdown file for low-latency context loading.
- MCP retrieval endpoint exposed when docs back a product an agent will operate (Anthropic, Cursor, Mintlify, and others ship MCP retrieval). MCP gives agents a live tool, not a static dump.
- Sitemap.xml + RSS feed with lastmod reflecting actual content changes — not deploy timestamps. Agents downgrade pages that look like noise.
- robots.txt allows GPTBot, ClaudeBot, PerplexityBot, Googlebot, CCBot unless content is licensed. Do not silently block AI crawlers.
- Canonical URL on every page (a <link rel="canonical"> tag in the head and canonical_url in any structured metadata). Prevents duplicate-page penalties.
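A minimal llms.txt in the shape the llmstxt.org proposal describes; every URL and description below is a placeholder:

```markdown
# Example Payments API

> Developer documentation for the Example Payments API: quickstart, API reference, and tutorials.

## API reference

- [Create a payment intent](https://docs.example.com/api/payment-intents.md): Create and confirm payments
- [Errors](https://docs.example.com/api/errors.md): Error codes and recovery guidance

## Tutorials

- [Quickstart](https://docs.example.com/quickstart.md): First API call in five minutes
```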
Category 3 — Tool-use signals (release gate: ≥7 of 8 pass)
- Every public API endpoint has an OpenAPI 3.x or JSON Schema definition linked from the page. Function-calling agents consume the spec directly (a sketch follows this list).
- Tool / function descriptions are imperative, complete sentences ("Returns the current account balance for the authenticated user"). Vague labels ("Account") fail.
- Parameters list type, required, default, enum, and example for every field. See our function calling documentation spec.
- Examples are runnable, not pseudocode. Curl + at least one SDK example per endpoint, copy-paste-ready.
- Idempotency is explicit. Mark every endpoint with idempotent: true|false so agents know whether they can safely retry.
- Side-effects are spelled out. Any endpoint that writes, deletes, charges, or sends notifications says so in the first sentence.
- Authentication scopes named. State exactly which scope or token is required for each endpoint.
- Rate limits documented per tier with header names (X-RateLimit-Limit, X-RateLimit-Remaining) and 429 retry guidance.
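A sketch of these signals in OpenAPI 3.x for a hypothetical endpoint. Note that x-idempotent is an invented vendor extension for illustration, not an OpenAPI built-in:

```yaml
paths:
  /v1/accounts/{account_id}/balance:
    get:
      operationId: getAccountBalance
      summary: Returns the current account balance for the authenticated user.
      description: >-
        Read-only, no side effects. Requires the balances:read scope.
        Rate limits are reported in X-RateLimit-Limit and X-RateLimit-Remaining.
      x-idempotent: true   # invented extension marking the call safe to retry
      parameters:
        - name: account_id
          in: path
          required: true
          schema:
            type: string
          example: acct_1a2b3c
        - name: currency
          in: query
          required: false
          schema:
            type: string
            enum: [USD, EUR, GBP]
            default: USD
      responses:
        "200":
          description: Current balance in minor units for the requested currency.
        "429":
          description: Rate limited. Honor Retry-After before retrying.
```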
Category 4 — Error states and self-repair (release gate: ≥5 of 6 pass)
- Every documented endpoint lists 4xx and 5xx error codes with name, HTTP status, condition, and recommended next action. See agent error handling documentation spec; an example payload follows this list.
- Validation errors include field names and machine-readable codes. "Invalid input" is not enough; agents need the field path and error code.
- Retryable vs terminal errors marked. Agents must know which 5xx responses warrant retries vs escalations.
- Hint phrasing for recovery. When the fix is non-obvious, do not stop at "contact support"; name a specific docs link or a runnable diagnostic.
- Common pitfalls section lists the top three mistakes agents make calling this API, with the right pattern.
- Deprecation notices include sunset date and replacement endpoint. Agents should never call a deprecated endpoint without a documented next step.
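For illustration, a hypothetical error payload an agent can act on without guessing; the field names are ours, not a standard:

```json
{
  "error": {
    "code": "invalid_currency",
    "status": 422,
    "field": "payment.currency",
    "message": "currency must be one of the ISO 4217 codes enabled on this account",
    "retryable": false,
    "doc_url": "https://docs.example.com/api/errors#invalid-currency"
  }
}
```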
Category 5 — Governance and trust (release gate: ≥4 of 5 pass)
- Page-level last_reviewed_at and version visible in the page metadata, not just commit history. Agents weight freshness (a front-matter sketch follows this list).
- Authorship trail — a named author or team with credentials/sameAs links. Salesforce and OWASP both list provenance as a top governance signal for agentic apps.
- License terms for content reuse (CC-BY, MIT, proprietary). Agents may refuse to cite content with unclear licenses.
- Do-not-train flags (e.g., noai, noimageai) used only intentionally. Default-on training blocks remove you from the citation pool.
- Change log linked from the page or doc set. A stable change-log URL helps agents reconcile prior call patterns with new behavior.
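One way to surface these signals is page front matter. The field names below are illustrative, not a schema your tooling will recognize out of the box:

```yaml
---
title: Create a payment intent
version: 2025-11
last_reviewed_at: 2025-11-04
author: Payments Docs Team
sameAs: https://docs.example.com/teams/payments-docs
license: CC-BY-4.0
changelog: https://docs.example.com/changelog
---
```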
Category 6 — Verification gates (release gate: ≥3 of 4 pass)
- Schema validators pass — Schema Markup Validator, Google Rich Results Test, OpenAPI lint (Spectral or Redocly).
- Anchor diff vs prior version is reviewed. No silently renamed anchors; renames should be tracked as redirects (a sketch follows this list).
- Agent dry-run executed. A scripted Claude / ChatGPT / Cursor agent attempts the top three workflows on the page; failures block release.
- A 24-hour no-regress monitor watches for new 4xx/5xx spikes from agent traffic post-publish; rollback plan exists.
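A minimal anchor-diff gate as a shell sketch; the dist/ layout, the id= pattern, and the anchors.prev file are all assumptions about your build pipeline:

```bash
# Extract anchor IDs from the built site; assumes the prior release's
# sorted list was saved as anchors.prev.
grep -rhoE 'id="[a-z0-9][a-z0-9-]*"' dist/ | sort -u > anchors.new

# Lines only in anchors.prev are anchors that existed before and are gone now.
comm -23 anchors.prev anchors.new > removed-anchors.txt

# A non-empty file means silently removed or renamed anchors: block the release.
test ! -s removed-anchors.txt
```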
Quick scoring template
| Category | Items | Pass threshold | Result |
|---|---|---|---|
| Structure & parseability | Q1-6 | ≥5 / 6 | |
| Retrieval & discovery | Q7-12 | ≥5 / 6 | |
| Tool-use signals | Q13-20 | ≥7 / 8 | |
| Error states & self-repair | Q21-26 | ≥5 / 6 | |
| Governance & trust | Q27-31 | ≥4 / 5 | |
| Verification gates | Q32-35 | ≥3 / 4 | |
All six categories must clear their threshold to release.
Hard fails (do not ship under any circumstance)
- Q11 fails (you are silently blocking AI crawlers but advertising agent-readiness).
- Q13 fails on a public API page (no OpenAPI / JSON Schema = unusable for function calling).
- Q21 fails on any endpoint that can write or charge.
- Q33 fails (silent anchor renames break every existing agent integration).
What to do if you fail
- Failures in Structure or Retrieval — fix mechanically. These are nearly always template / pipeline issues.
- Failures in Tool-use signals — push back to API owners. The fix is in the OpenAPI source, not the docs render.
- Failures in Error states — schedule a paired session with the engineer who owns the endpoint; this category has the highest hidden-debt count in audits.
- Failures in Governance — escalate to a docs lead; license and authorship policy is a doc-set-wide decision, not page-by-page.
FAQ
Q: How is this different from standard API doc QA?
Standard API doc QA optimizes for human developers reading sequentially. Agent-readiness adds three new failure modes: anchor instability, missing machine-readable error codes, and lack of MCP / llms.txt retrieval. A page can pass classic doc QA and still be invisible to autonomous agents.
Q: Do I need both llms.txt and an MCP server?
For static doc sets, llms.txt is enough. If your docs back a product where agents take live actions (search, call tools, fetch latest data), add an MCP retrieval endpoint. They serve different layers: llms.txt is a sitemap for LLMs, MCP is a live tool surface.
Q: Should I block AI crawlers if my content is licensed?
Block only the specific bots you have a contractual reason to block (training vs answer-time use can differ; Anthropic, for example, documents ClaudeBot for training crawls and Claude-User for user-initiated fetches). Default-block reduces your visibility without legal benefit.
Q: How often should I rerun this checklist?
Run the structure, tool-use, and error-state items on every page release. Run governance and verification on every doc-set release (typically quarterly). Run a full audit when an upstream model adds new agent capabilities (for example, ChatGPT workspace agents shipping in April 2026).
Q: Can I automate this checklist?
Q1-9, Q11-12, Q15-16, Q20, Q22-23, Q27, Q32, and Q34 are automatable in CI today (a wiring sketch follows). The rest still need a human pass for judgment, especially the prose quality of error hints and tool descriptions.
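A sketch of the CI wiring, with tool choices, paths, and script names as assumptions rather than a prescribed stack:

```yaml
name: docs-agent-readiness
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @stoplight/spectral-cli lint openapi.yaml         # Q13, Q32
      - run: npx markdownlint-cli2 "docs/**/*.md"                  # Q1, Q2, Q6
      - run: ./scripts/anchor-diff.sh                              # Q3, Q33 (hypothetical script; see Category 6)
      - run: test -f public/llms.txt && test -f public/robots.txt  # Q7, Q11
```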
Related Articles
Agent Authentication Documentation Spec
Document authentication for autonomous agents: OAuth flows, API keys, scopes, error states, and consent UX patterns AI agents need to operate safely.
Agent Tool Manifest QA Checklist: Validating SKILL.md and Tool Schemas Before Agent Discovery
QA checklist for agent tool manifests: validate SKILL.md, JSON schemas, descriptions, and examples so agents discover and call tools correctly.
llms.txt generator: requirements, output format, and validation checklist
A reference spec for llms.txt generators: inputs, output format, URL rules, and the validation checklist for a machine-readable file.