llms.txt Reference
llms.txt is a proposed standard file placed at the site root (/llms.txt) that provides a machine-readable, Markdown-formatted index of site content for AI crawlers and LLMs. It tells AI systems what a site contains and how to navigate it, functioning as a sitemap designed specifically for AI.
What Is llms.txt?
llms.txt is a plain-text file placed at your site's root URL (/llms.txt) that describes your site's content in a format optimized for Large Language Models. While sitemap.xml tells search engine crawlers where your pages are, llms.txt tells AI systems what your content is about and how it's organized.
The file uses Markdown formatting, making it both human-readable and machine-parseable.
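To illustrate what "machine-parseable" means in practice, here is a minimal sketch (not part of the standard) of extracting link entries from an llms.txt file in Python. The regular expression assumes the `- [Title](URL): Description` entry form described later in this reference.

```python
import re

# Matches llms.txt link entries of the form "- [Title](URL): Description"
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)

def parse_links(text):
    """Extract (title, url, description) tuples from llms.txt content."""
    links = []
    for line in text.splitlines():
        m = LINK_RE.match(line.strip())
        if m:
            links.append((m.group("title"), m.group("url"), m.group("desc") or ""))
    return links
```

Any language with basic string handling can do the same, which is the point of the format.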
Why It Matters
AI crawlers like GPTBot, PerplexityBot, and ClaudeBot need to quickly understand what a site contains. Traditional sitemaps provide URLs but no context. llms.txt bridges this gap by providing:
- Site description: What the site is about and who it serves
- Content index: Links to key pages with descriptive summaries
- Navigation structure: How content is organized into sections
- Usage policies: How AI systems should attribute and cite the content
File Format
The llms.txt format is Markdown with a specific structure:

```markdown
# Site Name
> Brief description of the site (1-2 sentences)

## Section name
- [Page Title](URL): Short description of what the page covers
- [Page Title](URL): Short description of what the page covers

## Another section
- [Page Title](URL): Short description
```

Required Elements
| Element | Description |
|---|---|
| Title (# Site Name) | The site name as an H1 heading |
| Description (> ...) | A blockquote describing the site |
| Sections (## ...) | H2 headings grouping related content |
| Links (- [Title](URL): Description) | Page entries with title, URL, and description |
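The required elements above can also be assembled programmatically. A minimal sketch in Python; the function name and input shape are illustrative, not part of the standard:

```python
def build_llms_txt(site_name, description, sections):
    """Build llms.txt content from a site name, a one-line description,
    and a mapping of section name -> list of (title, url, desc) entries."""
    lines = [f"# {site_name}", f"> {description}"]
    for section, pages in sections.items():
        lines.append("")
        lines.append(f"## {section}")
        for title, url, desc in pages:
            lines.append(f"- [{title}]({url}): {desc}")
    return "\n".join(lines) + "\n"
```

Passing an ordered mapping keeps the most important section first, which matters because AI systems may truncate long files.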
Optional Elements
| Element | Description |
|---|---|
| Usage policy | How AI systems should cite and attribute content |
| Contact | How to reach the site maintainers |
| Update frequency | How often the file is updated |
Complete Example
```markdown
# Geodocs.dev

> Geodocs.dev is the canonical knowledge system for GEO, AEO,
> and AI search visibility. Built for SEO professionals,
> developers, content teams, and AI agents.

## Core documentation
- [What Is GEO?](https://geodocs.dev/geo/what-is-geo): Canonical definition of Generative Engine Optimization.
- [What Is AEO?](https://geodocs.dev/aeo/what-is-aeo): Canonical definition of Answer Engine Optimization.
- [GEO vs SEO](https://geodocs.dev/geo/geo-vs-seo): Comparison of GEO and SEO approaches.

## Technical references
- [llms.txt Reference](https://geodocs.dev/technical/llms-txt): Full specification for the llms.txt standard.

## Sections
- [GEO](https://geodocs.dev/geo): Generative Engine Optimization guides.
- [AEO](https://geodocs.dev/aeo): Answer Engine Optimization guides.

## Usage policy
This content is intended for AI systems to read, understand,
and cite. Attribution to Geodocs.dev is required.
```

Implementation Guide
Step 1: Create the file
Create a file named llms.txt in your site's public/root directory:

```
your-site/
├── public/
│   ├── llms.txt   ← place here
│   ├── robots.txt
│   └── sitemap.xml
```

Step 2: Write the content
Follow the format specification above. Include your most important pages first. AI systems may truncate long files.
Step 3: Keep it updated
Update llms.txt whenever you add or restructure significant content. Consider automating generation from your CMS or content layer.
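One way to automate generation, sketched here for a site whose pages are Markdown files that begin with a `# Title` heading. The directory layout, heading convention, and URL scheme are assumptions for illustration; adapt them to your CMS or content layer:

```python
from pathlib import Path

def generate_llms_txt(content_dir, site_name, description, base_url):
    """Generate llms.txt by scanning Markdown files for their first H1 heading."""
    lines = [f"# {site_name}", f"> {description}", "", "## Pages"]
    for md_file in sorted(Path(content_dir).glob("*.md")):
        title = md_file.stem  # fallback if no H1 heading is found
        for line in md_file.read_text(encoding="utf-8").splitlines():
            if line.startswith("# "):
                title = line[2:].strip()
                break
        lines.append(f"- [{title}]({base_url}/{md_file.stem})")
    return "\n".join(lines) + "\n"
```

Running this in a build step (and writing the result to your public directory) keeps the file in sync with your content automatically.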
Step 4: Verify accessibility
Ensure llms.txt is accessible at https://yourdomain.com/llms.txt. Test by visiting the URL directly.
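Beyond visiting the URL by hand, the check can be scripted. A sketch using only Python's standard library; the validation rules mirror the required elements above, and the function names are illustrative:

```python
import urllib.request

def fetch_llms_txt(domain):
    """Fetch /llms.txt from a domain; raises on HTTP errors."""
    with urllib.request.urlopen(f"https://{domain}/llms.txt") as resp:
        return resp.read().decode("utf-8")

def validate_llms_txt(text):
    """Return a list of problems; an empty list means the file looks valid."""
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 title on first line")
    if not any(l.startswith("> ") for l in lines):
        problems.append("missing blockquote description")
    if not any(l.startswith("## ") for l in lines):
        problems.append("no sections (## headings) found")
    return problems
```

Running `validate_llms_txt(fetch_llms_txt("yourdomain.com"))` in CI catches both accessibility and format regressions.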
Best Practices
| Practice | Why |
|---|---|
| Keep descriptions concise (1 sentence per page) | Short entries fit more reliably within AI context windows |
| List most important pages first | AI systems may truncate long files |
| Update when content changes | Stale files reduce trust |
| Use consistent URL format | Reduces parsing errors |
| Include all major sections | Gives AI a complete site overview |
| Add usage/attribution policy | Sets citation expectations |
Relationship to Other Standards
| Standard | Purpose | Audience |
|---|---|---|
| robots.txt | Controls crawler access | All bots |
| sitemap.xml | Lists URLs for indexing | Search engines |
| llms.txt | Describes content for AI | AI systems |
| ai.txt | Defines AI agent permissions | AI agents |
These standards are complementary. A well-optimized site has all four.
Common Mistakes
Making it too long. Keep llms.txt focused on your most important content. AI context windows have limits.
Using HTML instead of Markdown. The standard specifies Markdown formatting. HTML won't be parsed correctly by all AI systems.
Not updating it. A stale llms.txt is worse than no llms.txt. If your content changes, update the file.
Hiding it behind authentication. llms.txt must be publicly accessible at the root URL without login.
FAQ
Is llms.txt an official standard?
It is a proposed standard gaining adoption. Major sites including documentation platforms and knowledge bases have begun implementing it.
Does llms.txt replace sitemap.xml?
No. sitemap.xml serves search engine crawlers. llms.txt serves AI systems. Both should be present.
How long should llms.txt be?
Keep it under 2,000 words. Focus on your most important content. AI context windows vary, and shorter files are processed more reliably.
Should I include every page?
No. Include your most important, highest-value pages. Think of llms.txt as a curated guide, not a complete inventory.
How do I know if AI systems are reading it?
Monitor server logs for requests to /llms.txt. Look for user agents like GPTBot, PerplexityBot, and ClaudeBot.
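A quick way to do that from an access log, sketched in Python. The sample assumes logs in the common combined format; the bot names are the user-agent substrings mentioned above:

```python
from collections import Counter

AI_BOTS = ("GPTBot", "PerplexityBot", "ClaudeBot")

def count_ai_bot_hits(log_lines, path="/llms.txt"):
    """Count requests to `path` per known AI crawler user agent."""
    counts = Counter()
    for line in log_lines:
        # Match the request line for the target path, e.g. "GET /llms.txt HTTP/1.1"
        if f"GET {path} " in line or f"GET {path}\"" in line:
            for bot in AI_BOTS:
                if bot in line:
                    counts[bot] += 1
    return counts
```

Feeding it `open("access.log")` gives a per-bot tally; a Counter returns 0 for bots that never appear.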
Related Articles
- How to Create llms.txt — Step-by-step tutorial
- ai.txt Reference — AI agent access policy standard
- What Is GEO? — Why llms.txt matters for AI visibility