Geodocs.dev

LLMs.txt Generator Tools: Evaluation checklist + best options (2026)



Use this checklist to evaluate llms.txt generators on sitemap coverage, llms-full.txt support, per-section descriptions, and build-pipeline integration. Mintlify, FireCrawl, and llmstxt.dev are the most-used 2026 options.

TL;DR

llms.txt is a markdown file that tells AI agents what your site contains, where the canonical content lives, and how to navigate it. A good generator (1) crawls your full sitemap, (2) emits both llms.txt and llms-full.txt, (3) lets you write per-section descriptions, and (4) regenerates on every deploy.

What is llms.txt?

llms.txt is a proposed standard, hosted at /llms.txt, that gives LLMs a curated map of a site's most-citable content. The companion llms-full.txt includes the actual markdown content of those pages so AI agents can ingest it without crawling each URL. Adoption accelerated through 2025-2026 as Anthropic, Mintlify, and Vercel began shipping first-class support.
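The proposed format is plain markdown: an H1 title, a blockquote summary, then H2 sections containing link lists. A minimal illustration (example.com and all page names are placeholders):

```markdown
# Example Docs

> Developer documentation for the Example platform: API reference, guides, and deployment notes.

## Guides

- [Quickstart](https://example.com/docs/quickstart): Install the CLI and deploy in five minutes.

## Reference

- [API Reference](https://example.com/docs/api): Endpoints, authentication, and error codes.
```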

Evaluation checklist

Use the items below as a yes/no checklist when comparing generators.

Coverage

  • [ ] Reads your full sitemap.xml automatically
  • [ ] Supports manual page lists for sites without sitemaps
  • [ ] Excludes draft, archived, and noindex pages
  • [ ] Respects custom include/exclude rules
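The coverage items above can be sketched in a few lines: read the sitemap, then keep only the URLs that pass glob-style include/exclude rules. A minimal sketch, assuming glob patterns for the rules (the sitemap contents and patterns here are illustrative):

```python
import fnmatch
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text: str) -> list[str]:
    """Extract <loc> entries from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc")]

def apply_rules(urls, include=("*",), exclude=()):
    """Keep URLs matching at least one include glob and no exclude glob."""
    kept = []
    for url in urls:
        if not any(fnmatch.fnmatch(url, pat) for pat in include):
            continue
        if any(fnmatch.fnmatch(url, pat) for pat in exclude):
            continue
        kept.append(url)
    return kept

sitemap = """<?xml version="1.0"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/quickstart</loc></url>
  <url><loc>https://example.com/drafts/wip</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
</urlset>"""

urls = apply_rules(sitemap_urls(sitemap), exclude=("*/drafts/*",))
print(urls)  # the two /docs/ pages survive; the draft is excluded
```

A real generator would also need to fetch rendered pages to detect noindex, which a sitemap alone cannot tell you.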

Output

  • [ ] Emits llms.txt (curated index)
  • [ ] Emits llms-full.txt (full markdown content)
  • [ ] Supports per-section descriptions in markdown
  • [ ] Preserves heading hierarchy and frontmatter

Build pipeline

  • [ ] CLI command for CI/CD
  • [ ] GitHub Action available
  • [ ] Vercel/Netlify integration
  • [ ] Build-fail option when content drops below a threshold
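The build-fail item can be as simple as a word-count floor checked in a CI step. A minimal sketch (the threshold, file contents, and exit message are hypothetical):

```python
import sys

def check_threshold(text: str, min_words: int) -> bool:
    """Return True when the generated file clears the word-count floor."""
    return len(text.split()) >= min_words

# In CI, read the freshly generated llms.txt and exit non-zero on regression.
generated = "# My Docs\n\n- [Quickstart](https://example.com/quickstart): Setup guide."
if not check_threshold(generated, min_words=5):
    sys.exit("llms.txt fell below the word-count threshold; failing the build")
```

Wired into a pipeline step or a GitHub Action, this makes a silently emptied llms.txt fail the deploy instead of shipping.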

Quality controls

  • [ ] Word-count diff between runs
  • [ ] Validation against the llms.txt spec
  • [ ] Diff preview before commit
  • [ ] Ability to override page titles

Maintenance

  • [ ] Project actively maintained (commits in last 90 days)
  • [ ] Issue response within 30 days
  • [ ] Public roadmap or changelog

Hosting

  • [ ] Deployment model fits your stack (self-hosted or hosted SaaS)
  • [ ] Output is plain text (no DRM, no auth)
  • [ ] CDN-friendly (no cookies, no dynamic params)

Best options in 2026

Mintlify

If your docs are on Mintlify, native llms.txt generation runs on every deploy. Coverage and llms-full.txt are first-class. Best fit for product docs and developer-experience teams.

FireCrawl

Open-source crawler that ingests any site (sitemap or seed URL) and outputs llms.txt + llms-full.txt. Best for teams whose primary content is not on a docs platform (marketing sites, blogs, knowledge bases).

llmstxt.dev

A hosted SaaS generator that runs on a schedule and exposes llms.txt/llms-full.txt over a CDN endpoint. Best for non-engineering content teams that need zero-code setup.

Vercel native (/llms.txt route)

If you ship Next.js on Vercel, you can colocate an llms.txt route generator with your build, regenerate per deploy, and avoid third-party dependencies.

Quality bar your output should meet

  • Top 100 most-citable pages, sorted by hub importance
  • Each entry: title, canonical URL, and 1-2 sentence description
  • Sections grouped by content type (# Guides, # References, # Case studies)
  • The URL set in llms-full.txt matches llms.txt 1:1
  • File size under 1MB for llms.txt; llms-full.txt may be larger
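The 1:1 requirement above is easy to verify mechanically by comparing the markdown link targets in the two files. A rough sketch (the regex covers simple `[title](url)` links only; the sample strings are illustrative):

```python
import re

LINK = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")

def url_set(markdown: str) -> set[str]:
    """Collect all absolute link targets in a markdown document."""
    return set(LINK.findall(markdown))

def matches_one_to_one(llms_txt: str, llms_full_txt: str) -> bool:
    """True when both files reference exactly the same set of URLs."""
    return url_set(llms_txt) == url_set(llms_full_txt)

index = "- [API](https://example.com/api): REST reference."
full = "# API\nSource: [API](https://example.com/api)\n..."
print(matches_one_to_one(index, full))  # → True
```

Running this check as part of the same CI step that generates both files catches drift before it reaches production.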

How to apply

  1. Audit current site coverage and pick a generator that fits your stack.
  2. Run the generator locally and review output against the checklist.
  3. Wire it into CI/CD so llms.txt regenerates on every content change.
  4. Submit your llms.txt URL to Perplexity and Anthropic where supported.
  5. Validate quarterly: re-crawl with the llmstxt.dev validator or an equivalent tool.

FAQ

Q: Is llms.txt a real standard?

It is a proposed convention with strong adoption in 2025-2026 (Mintlify, Anthropic, Vercel) but not yet a formal IETF standard. AI engines that respect it treat it as a hint, not a directive.

Q: Do I need both llms.txt and llms-full.txt?

For docs and knowledge bases, yes — llms-full.txt lets agents ingest content without crawling each URL. For marketing sites, llms.txt alone is often enough.

Q: Where do I put llms.txt?

At the root: https://yourdomain.com/llms.txt. Some publishers also expose /llms-full.txt and /.well-known/llms.txt for forward compatibility.

Q: Will llms.txt replace robots.txt?

No. robots.txt controls crawl behavior; llms.txt describes content for ingestion. They serve different layers of the AI agent stack.

Q: Do search engines read llms.txt?

Classical search engines do not use it for ranking. AI engines and AI agents may use it as a content map for retrieval and citation.

