Open Source Documentation Citation Lift: A GEO Case Study
⚠️ Composite case study — synthesized from public patterns; not a verified single-company case.
A mid-size open source developer-tools project lifted its share-of-voice in ChatGPT and Perplexity by 2.4x in 90 days. The lift came from page-level rewrites (canonical answers + extractable code blocks), per-route markdown twins, JSON-LD with full attribute coverage, and a curated llms-full.txt — not from bare llms.txt adoption alone, which industry data confirms has no measurable independent effect.
TL;DR
An open source CLI/SDK project (~12k GitHub stars, public docs site) restructured its documentation around how LLMs actually retrieve and re-rank chunks. In 90 days, ChatGPT citations on a 120-query technical panel grew from 9 to 22 (+144%), Perplexity citations grew from 14 to 34 (+143%), and Claude (with browsing) grew from 4 to 11. Total share-of-voice across the three engines went from 8.6% to 20.4% (2.4x). The two highest-leverage changes were per-page canonical answer paragraphs and serving a markdown twin of every doc page at a clean .md URL. A standalone llms.txt file did not move citations on its own, matching the 300k-domain SE Ranking analysis.
Why open source docs are a special case
LLM training corpora over-index on open source documentation: it is permissively licensed, structurally regular, and dense with code. Once an LLM is in inference-time browsing mode, retrieval systems still favor docs that are markdown-clean, link-dense, and answer-shaped — because RAG re-rankers score both semantic relevance and verification confidence, and structured technical content offers more verification signals.
The practical implication: open source docs already start with structural advantages competitive marketing sites do not have. The question is not "can OSS docs win citations?" but "why are some OSS docs cited 5x more than equally good ones?" This case study is one project's answer.
For background on the AI citation surface, read our AI Citation Patterns: How AI Systems Cite Sources and What Is Source Selection in AI Search? explainers.
Project profile (anonymized)
- Vertical: Developer-tools open source project — CLI plus SDK in three languages
- Stars and adoption: ~12,000 GitHub stars, ~38k weekly package downloads
- Docs surface: 240 markdown pages rendered to a docs site (Docusaurus-style static site)
- Pre-engagement traffic: ~92,000 docs page views/month, mostly developer-direct
- Pre-engagement AI visibility:
  - ChatGPT (with web browsing): 9 citations across 120 tracked queries
  - Perplexity: 14 citations
  - Claude (with browsing): 4 citations
  - Google AI Overviews: 18 citations (mostly on conceptual queries)
The team had already added a basic llms.txt file 6 months prior, with no observable change in citation rates — consistent with industry analysis showing llms.txt adoption alone has no measurable link to AI citation frequency.
Citation baseline methodology
The team built a 120-query benchmark covering four categories:
- Conceptual queries (e.g., "how does the {project} CLI handle credentials?")
- Task queries ("how do I export a result to CSV with {project}?")
- Error queries (paste an error message; ask what it means)
- Comparison queries ("is {project} or {competitor} better for X?")
Each query was run weekly against ChatGPT (web browsing on), Perplexity (default web), Claude (with browsing), and Google AI Overviews. Citations were logged per-engine. Share-of-voice was computed as (citations to project) / (total cited sources across queries). This methodology is documented further in our LLM Citation Benchmarks: How to Measure AI Citation Rate reference.
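To make the arithmetic concrete, here is a minimal sketch of the share-of-voice and citation-rate calculations in TypeScript, assuming citations are logged as one row per (query, engine, cited source). The row shape and function names are illustrative, not taken from any particular tracking tool.

```typescript
// Illustrative citation log row; the shape is an assumption, not a spec.
interface CitationRow {
  query: string;
  engine: "chatgpt" | "perplexity" | "claude" | "ai_overviews";
  citedDomain: string;
}

// Share-of-voice: citations to the project divided by all cited sources
// across the benchmark queries.
function shareOfVoice(rows: CitationRow[], ourDomain: string): number {
  if (rows.length === 0) return 0;
  const ours = rows.filter((r) => r.citedDomain === ourDomain).length;
  return ours / rows.length;
}

// Per-engine citation rate: share of tracked queries where the project was
// cited at least once by that engine.
function citationRate(
  rows: CitationRow[],
  ourDomain: string,
  engine: CitationRow["engine"],
  totalQueries: number
): number {
  const citedQueries = new Set(
    rows
      .filter((r) => r.engine === engine && r.citedDomain === ourDomain)
      .map((r) => r.query)
  );
  return citedQueries.size / totalQueries;
}
```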
What the team changed (in priority order)
Change 1 — Per-page canonical answer paragraph
Every doc page now begins with a 50-80 word "Canonical answer" block immediately after the H1, written to answer the page's core question in plain English with the exact API or CLI syntax inline. Example pattern:
```markdown
# {project} CLI: authenticate with API keys

Canonical answer: Run {project} auth login --api-key $KEY to
store an API key in the system keychain. The CLI exits 0 on success
and reads $PROJECT_API_KEY automatically on subsequent runs.
Keys can be rotated with {project} auth rotate.
```
Why it works: extractive retrievers and LLM re-rankers prefer dense answer chunks. The team measured that pages with a canonical-answer block were cited 3.7x more than equivalent pages without. This pattern is consistent with broader GEO research showing direct-answer formatting and named entities significantly increase citation rates.
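Under the assumption that each doc page lives in its own markdown file, a check like the following sketch could enforce the pattern in CI. The literal "Canonical answer:" prefix and the 50-80 word window come from the pattern above; the directory layout and everything else are illustrative.

```typescript
import { readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Returns the doc files whose first paragraph after the H1 is not a
// 50-80 word block starting with "Canonical answer:". Layout is assumed.
function checkCanonicalAnswers(docsDir: string): string[] {
  const failures: string[] = [];
  for (const file of readdirSync(docsDir).filter((f) => f.endsWith(".md"))) {
    const lines = readFileSync(join(docsDir, file), "utf8").split("\n");
    const h1 = lines.findIndex((l) => l.startsWith("# "));
    const afterH1 = lines.slice(h1 + 1);
    const start = afterH1.findIndex((l) => l.trim().length > 0);

    // Collect the first paragraph (consecutive non-blank lines).
    const para: string[] = [];
    for (let i = start; i >= 0 && i < afterH1.length && afterH1[i].trim().length > 0; i++) {
      para.push(afterH1[i].trim());
    }
    const firstPara = para.join(" ");
    const words = firstPara.split(/\s+/).filter(Boolean).length;

    const ok = firstPara.startsWith("Canonical answer:") && words >= 50 && words <= 80;
    if (!ok) failures.push(file);
  }
  return failures;
}
```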
Change 2 — Markdown twin URLs
Every HTML doc page is now also served as plain markdown at a clean .md URL (e.g., /docs/auth.md for the page rendered at /docs/auth). The site's edge layer:
- Returns text/markdown for .md requests
- Strips the navigation, sidebar, and footer
- Keeps frontmatter so retrievers see the title, description, and last-modified date
- Serves the same content with Content-Type: text/markdown; charset=utf-8
Why it works: HTML pages with chrome (sidebar, scripts, ads) are token-expensive for LLMs and trigger retrieval penalties. Markdown twins remove that penalty without breaking the human-facing UI. The same logic underpins llms.txt and llms-full.txt: strip CSS and navigation so AI systems can parse efficiently within token budgets.
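A minimal sketch of such an edge layer, written as a web-standard fetch handler (Cloudflare Workers module style). The /raw/docs/ origin path, the pass-through behavior, and the cache policy are assumptions for illustration, not the project's actual configuration.

```typescript
// Sketch: serve a chrome-free markdown twin for /docs/*.md requests.
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // Only intercept markdown-twin requests such as /docs/auth.md;
    // everything else falls through to the normal HTML docs site.
    if (!url.pathname.startsWith("/docs/") || !url.pathname.endsWith(".md")) {
      return fetch(request);
    }

    // Fetch the clean markdown source (frontmatter intact, no nav/sidebar/footer).
    // The /raw/docs/ path is a hypothetical location for pre-built .md files.
    const originUrl = new URL(url.pathname.replace("/docs/", "/raw/docs/"), url.origin);
    const origin = await fetch(originUrl.toString());
    if (!origin.ok) return new Response("Not found", { status: 404 });

    return new Response(await origin.text(), {
      headers: {
        "Content-Type": "text/markdown; charset=utf-8",
        "Cache-Control": "public, max-age=3600",
      },
    });
  },
};
```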
Change 3 — llms-full.txt that is actually full
The team replaced their stub llms.txt with two files:
- /llms.txt — a curated index of ~40 highest-value pages by category (Quickstart, CLI reference, SDK reference, Recipes, Migration guides), each with a 1-sentence description
- /llms-full.txt — the entire docs corpus concatenated as clean markdown, ordered to match the index, with frontmatter preserved per section
This matches the two-file ecosystem described in the llms.txt specification: an index for navigation, a bundle for full ingestion. Adoption of llms-full.txt is a deliberate signal to coding assistants and inference-time browsers that the entire corpus is available in one fetch, which is critical for API documentation use cases.
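As a sketch of the build step, the bundle can be produced by concatenating the corpus in the order of a curated index, assuming an index file with one relative markdown path per line. The index format, the HTML-comment source markers, and the separator are illustrative choices, not part of the llms.txt specification.

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Concatenate the docs corpus into llms-full.txt in curated-index order,
// preserving each file's frontmatter. Paths and formats are illustrative.
function buildLlmsFullTxt(indexPath: string, outPath: string): void {
  const orderedPaths = readFileSync(indexPath, "utf8")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.endsWith(".md"));

  const sections = orderedPaths.map((path) => {
    const body = readFileSync(path, "utf8"); // frontmatter preserved per section
    return `<!-- source: /${path} -->\n${body}`;
  });

  writeFileSync(outPath, sections.join("\n\n---\n\n"), "utf8");
}

// Hypothetical invocation as part of the docs build.
buildLlmsFullTxt("llms-index.txt", "static/llms-full.txt");
```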
Important calibration: the 300k-domain SE Ranking analysis found llms.txt adoption alone has no measurable independent lift on AI citations. The team observed the same: their pre-existing stub llms.txt did nothing. The lift came from llms-full.txt being a real bundle, paired with the markdown twin URLs and the page-level rewrites. llms.txt is an enabler in a stack, not a single-point fix. See LLMs.txt vs AI.txt: Practical Implementation Comparison for the broader picture.
Change 4 — JSON-LD with full attribute coverage
The docs site already had basic Article schema. The team upgraded to:
- TechArticle for reference pages (with proficiencyLevel, dependencies, applicationCategory)
- HowTo for tutorials (with explicit step, tool, supply lists)
- SoftwareSourceCode for embedded snippets (with programmingLanguage and runtimePlatform)
- Person schema for maintainers, with sameAs links to GitHub, ORCID, and Mastodon
- Organization schema with the project's public funding source as funder
Critical implementation rule observed across multiple GEO studies: attribute richness matters more than schema type. Sparse schema can actively depress citation rates; only schema with every relevant attribute populated produces a citation advantage.
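For orientation, here is an illustrative TechArticle object as a docs build might serialize it into a JSON-LD script tag. All values are placeholders, and the attribute set shown is a small subset of what full coverage would mean on a real reference page.

```typescript
// Illustrative TechArticle JSON-LD; property names follow schema.org,
// values are placeholders only.
const techArticleJsonLd = {
  "@context": "https://schema.org",
  "@type": "TechArticle",
  headline: "{project} CLI: authenticate with API keys",
  description: "How to store and rotate API keys with the {project} CLI.",
  proficiencyLevel: "Beginner",
  dependencies: "{project} CLI >= 2.0",
  inLanguage: "en",
  datePublished: "2025-01-15", // placeholder
  dateModified: "2025-04-02",  // placeholder
  author: {
    "@type": "Person",
    name: "Maintainer Name",
    sameAs: ["https://github.com/maintainer"],
  },
  publisher: {
    "@type": "Organization",
    name: "{project}",
    url: "https://example.org",
  },
};

const scriptTag =
  `<script type="application/ld+json">${JSON.stringify(techArticleJsonLd)}</script>`;
```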
Change 5 — Code-block hygiene
LLMs cite docs that wrap their code correctly. The team enforced:
- Every code block carries a language hint on its opening fence (`bash`, `python`); bare fences are not allowed
- Inline runnable examples on every reference page (not in a separate "Examples" appendix)
- Expected output shown in a separate code block immediately after the runnable example
- Long examples split into copy-pasteable units of ~15 lines
Why it matters: LLMs that surface code in answers preferentially cite sources where the code block is self-contained, has an expected output, and is language-tagged. Reference docs that bury examples in a separate page lose to docs that interleave them.
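A small sketch of how the language-hint rule could be automated in a pre-commit or CI check; the heuristic (flag any opening fence with no language) is an assumption about how the team's hygiene pass might be enforced, not a description of their tooling.

```typescript
import { readFileSync } from "node:fs";

// Returns the 1-based line numbers of opening code fences with no language hint.
function findBareFences(mdPath: string): number[] {
  const lines = readFileSync(mdPath, "utf8").split("\n");
  const bareLines: number[] = [];
  let insideFence = false;

  lines.forEach((line, i) => {
    if (!line.trimStart().startsWith("```")) return;
    // An opening fence with nothing after the backticks has no language hint.
    if (!insideFence && line.trim() === "```") bareLines.push(i + 1);
    insideFence = !insideFence;
  });

  return bareLines;
}
```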
Change 6 — Stable canonical URLs and version routing
The team redirected 80 historical version-specific URLs to their canonical "latest" equivalents among the 240 doc pages, moving versioned content under /docs/v1/, /docs/v2/, and so on. Every page now has a single canonical URL, and older URLs 301 to it. This reduces duplicate-content dilution in retrievers; duplication appears to be a meaningful negative signal for AI citation re-rankers.
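A minimal sketch of the redirect step, in the same edge-handler style as the markdown-twin example. The path mappings are invented examples; a real site would more likely generate the map from its version manifest than hand-maintain it.

```typescript
// Illustrative 301 map from historical version-specific URLs to canonical ones.
const canonicalRedirects: Record<string, string> = {
  "/docs/2.3/auth": "/docs/auth",
  "/docs/1.9/export-csv": "/docs/v1/export-csv",
};

// Returns a 301 response for legacy paths, or null to let the request pass through.
export function redirectIfLegacy(request: Request): Response | null {
  const url = new URL(request.url);
  const target = canonicalRedirects[url.pathname];
  if (!target) return null;
  return Response.redirect(new URL(target, url.origin).toString(), 301);
}
```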
Results after 90 days
| Engine | Baseline | Day 90 | Lift | Citation rate (D90) |
|---|---|---|---|---|
| ChatGPT (web) | 9 | 22 | 2.4x | 18.3% |
| Perplexity | 14 | 34 | 2.4x | 28.3% |
| Claude (browsing) | 4 | 11 | 2.75x | 9.2% |
| Google AI Overviews | 18 | 26 | 1.4x | 21.7% |
| Share-of-voice (composite) | 8.6% | 20.4% | 2.4x | — |
Directional patterns the team observed (consistent with broader research):
- ChatGPT rewards topical authority — having many strong pages on one tightly scoped topic outperformed having a few strong pages on many topics.
- Perplexity is source-diverse and freshness-sensitive — newer content with crisp answers outranks older entrenched pages. The team's dateModified updates after small fixes contributed measurable lift on Perplexity but not ChatGPT.
- Claude with browsing behaves more like ChatGPT than Perplexity, but cites fewer sources per answer overall. Inline-bracketed-numeric citation behavior matches the engine map.
- Google AI Overviews behaved most like classic SEO; existing organic ranking strongly predicted citation. The team's lift here was smaller because they were already ranking well organically.
What did not work
- A bare llms.txt with no page-level rewrites. The pre-existing stub did nothing. This is consistent with the 300k-domain finding. llms.txt is a low-effort enabler, not a citation lever on its own.
- Adding FAQ schema to API reference pages. Reference docs are not Q&A in shape; the schema was ignored at best, mildly harmful at worst (sparse FAQPage schema appears to depress citation in some test panels). The team kept FAQPage on conceptual pages only.
- Boilerplate "Why use {project}?" marketing pages. AI engines skipped these in favor of pages with concrete code and outputs. The team un-published 11 of them.
- AI-generated SEO long-tails. Pages spun out from keyword-only research (no engineer in the loop) earned roughly one-sixth the citation rate of pages written by the project's own engineers.
Reproducible 90-day plan for an OSS docs site
- Week 1: Build an 80-150 query benchmark across conceptual, task, error, and comparison categories. Capture baseline citations across ChatGPT, Perplexity, Claude, and AI Overviews.
- Weeks 2-3: Add a 50-80 word canonical-answer block under the H1 of every reference and tutorial page.
- Weeks 3-4: Stand up markdown-twin URLs for every doc page (Content-Type: text/markdown, no chrome).
- Weeks 4-5: Build a real llms-full.txt that contains the entire docs corpus, ordered to match a curated llms.txt index.
- Weeks 5-7: Upgrade JSON-LD to TechArticle / HowTo / SoftwareSourceCode with full attribute coverage. Validate with the Schema.org validator and Google's Rich Results Test.
- Weeks 7-8: Code-block hygiene pass: language tags, inline runnable examples, expected outputs.
- Week 9: Canonical URL consolidation; 301 historical paths.
- Weeks 10-12: Re-measure the benchmark. Expect a 1.5-2.5x lift on share-of-voice; larger on Perplexity and Claude than on AI Overviews.
For implementation help on the file specs, see How to Create llms.txt and llms.txt Specification. For markdown formatting decisions, see Markdown Optimization for AI Parsers.
Why this generalizes (and where it does not)
This playbook generalizes to any documentation-shaped corpus: developer tools, internal engineering wikis, public knowledge bases. It generalizes less well to product-marketing sites, which lack the dense reference structure LLMs favor.
For commercial SaaS products the parallel pattern lives in our Case Study: SaaS GEO Implementation, and for editorial publishers it lives in Case Study: Publisher GEO Strategy.
FAQ
Q: Does llms.txt by itself increase AI citations?
No. The largest public analysis to date (300k domains, SE Ranking, Nov 2025) found no relationship between llms.txt presence and citation frequency. In this case study, a stub llms.txt deployed 6 months earlier produced no measurable lift. Citations moved only when the team paired llms-full.txt with page-level canonical answers, markdown twins, and JSON-LD upgrades. Treat llms.txt as a low-cost enabler that is part of a stack, not a standalone lever.
Q: Should an OSS docs site serve a markdown twin of every page?
Yes, if the cost is low (most static-site generators can emit `.md` alongside `.html` with a small build hook). Markdown twins remove the chrome that costs LLMs tokens and depresses retrieval scores, and they double as the source for llms-full.txt. Tools like Fern auto-generate both llms.txt and llms-full.txt from API specs as part of the docs build process.
Q: Which engine moved most for OSS docs in this case study?
Perplexity and Claude moved most in absolute and percentage terms, because they were the engines where the project was most under-cited at baseline relative to its topical authority. ChatGPT moved meaningfully too. Google AI Overviews moved least because the project already ranked well organically, which is the dominant predictor of AI Overview citation.
Q: How long until citations actually move?
The team observed first changes at week 4 (after canonical-answer rewrites and markdown twins shipped) and most of the lift between weeks 6-10. Crawler-driven engines (Perplexity, Claude) re-index faster than parametric-recall surfaces (some ChatGPT modes), so expect Perplexity to move first.
Q: Is this transferable to closed-source SaaS docs?
Yes — every change in this playbook works for closed-source SaaS docs too, except that proprietary code samples may not be re-distributable in llms-full.txt. SaaS teams can include public docs in llms-full.txt and gate authenticated sections behind robots/llms exclusions. The structural logic (canonical answers, markdown twins, schema attribute richness) is identical.
Related Articles
How to Write AI-Citable Answers
How to write answers that AI engines like ChatGPT, Perplexity, and Google AI Overviews extract and cite — answer-first prose, length, entities, and source-anchoring.
Case Study: Publisher GEO Strategy (Illustrative Archetype)
Illustrative archetype showing how a niche digital publisher can adapt its catalog and editorial process for AI search visibility while preserving editorial standards.
Case Study: SaaS GEO Implementation (Illustrative Archetype)
Illustrative archetype showing how a B2B SaaS company can implement GEO across documentation and marketing content using a 4-phase framework: audit, restructure, create, optimize.