Pre-Launch GEO Readiness Checklist: 30 Citation Signals to Verify Before Product Launches
A pre-launch GEO readiness checklist verifies 30 atomic citation signals across four pillars — entity coverage, JSON-LD schema, llms.txt, and pressroom assets — before a product goes live. Completing all 30 makes the launch page extractable, attributable, and quotable by ChatGPT, Perplexity, Google AI Overviews, Claude, and Copilot on day one.
TL;DR
Legacy SEO launch checklists ignore the signals that determine whether AI engines will cite your launch page. This 30-item pre-launch GEO readiness checklist groups every must-verify control into four pillars — entity, schema, llms.txt, and pressroom — and is designed to be worked through in the 48 hours before go-live. Treat each item as binary pass/fail; ship only when all 30 are green.
When to use this checklist
- 48-72 hours before a product, feature, or pricing launch.
- During an embargo window when press kits are finalized.
- After a rebrand, rename, or domain migration that changes any canonical entity.
- Before a funding announcement, when AI engines will be queried for company details within minutes of the press release.
If you are auditing an already-live page, run the citation readiness framework instead — it scores existing pages on the same signals.
How AI engines decide what to cite at launch
Generative engines select citations based on three properties of the source: it must be extractable (parseable to clean text), attributable (entity-resolvable to a known organization), and quotable (containing short, factual sentences a model can reuse verbatim). Most launch pages fail at least one of these the moment they ship — usually because schema is missing, llms.txt has not been updated, or the press kit lives behind a marketing CDN that AI crawlers cannot reach.
The 30 signals
Pillar 1 — Entity coverage (signals 1-8)
Entity coverage tells AI engines who is launching and what the launch is. Without it, your page may be summarized but the brand will be omitted from the citation.
- [ ] 1. Canonical entity name appears in the <title> tag, the H1, and the first 160 characters of body copy — no nicknames, no taglines in front of it.
- [ ] 2. Brand mentioned in the first sentence so an extractor sees it before truncation.
- [ ] 3. Wikipedia and Wikidata entries exist for the company (and the product if it has its own page) and link to the launch domain via official website (P856).
- [ ] 4. Crunchbase, LinkedIn Company, and G2 profiles are live, consistent, and link to the canonical domain.
- [ ] 5. Aliases enumerated on the page: legal name, common name, ticker, internal codename — listed in plain text or in sameAs.
- [ ] 6. Founders, CEO, or product lead named with role so AI engines can resolve a person-to-org edge.
- [ ] 7. Headquarters or operating geography stated in plain text near the top of the page.
- [ ] 8. One-sentence canonical description (≤200 chars) appears verbatim on the launch page, the homepage, and the press kit. This is the sentence you want quoted.
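Signals 3–5 come together in a single Organization block. A minimal sketch of how aliases and profile links might be declared — every name, QID, and URL below is a hypothetical placeholder, not a real entity:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Widgets Inc.",
  "alternateName": ["Acme", "Acme Widgets", "AWI"],
  "url": "https://www.example.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Acme_Widgets",
    "https://www.wikidata.org/wiki/Q000000",
    "https://www.linkedin.com/company/acme-widgets",
    "https://www.crunchbase.com/organization/acme-widgets"
  ]
}
```

The alternateName array covers signal 5's alias list; each sameAs URL should resolve to a live profile that links back to the canonical domain (signals 3 and 4).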
Pillar 2 — Schema and structured data (signals 9-17)
Schema markup is the most consistently recommended GEO control because it lets AI engines verify entity facts without parsing prose.
- [ ] 9. JSON-LD Product (or SoftwareApplication) is present in the <head>, validates against schema.org/Product, and includes name, description, brand, offers, and image.
- [ ] 10. JSON-LD Organization sitewide with name, url, logo, sameAs[] (Wikipedia, Wikidata, LinkedIn, X), and foundingDate.
- [ ] 11. JSON-LD BreadcrumbList reflects the URL hierarchy of the launch page.
- [ ] 12. JSON-LD FAQPage wraps any on-page Q&A block.
- [ ] 13. JSON-LD WebPage with datePublished and dateModified matches launch day, not a stale CMS default.
- [ ] 14. Offer block has price, currency, and availability (or PriceSpecification for tiered SaaS) — even a free tier should declare price: 0.
- [ ] 15. Markup validates in Google's Rich Results Test and the schema.org validator with zero errors.
- [ ] 16. Open Graph and Twitter Card are set with launch-specific og:title, og:description, and a 1200×630 image — reused by AI summarizers when JSON-LD is sparse.
- [ ] 17. The rel="canonical" link points to the production URL, not a staging or campaign-tracked URL.
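A minimal sketch of the Product block signals 9 and 14 call for, including a free tier declared as price 0 — product name, brand, and URLs are illustrative placeholders:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Acme Widgets Cloud",
  "description": "Acme Widgets Cloud is a hosted widget platform for engineering teams.",
  "brand": { "@type": "Organization", "name": "Acme Widgets Inc." },
  "image": "https://www.example.com/press/acme-widgets-cloud-1200x630.png",
  "offers": {
    "@type": "Offer",
    "price": "0",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
```

Run the finished block through the Rich Results Test and the schema.org validator (signal 15) with the production URL, not a template copy.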
Pillar 3 — llms.txt and crawler controls (signals 18-24)
The /llms.txt standard, proposed by Jeremy Howard in September 2024, gives AI systems a curated map of your most-citable content. Adoption by LLM-specific bots is uneven, but launch pages should ship with it because human-driven AI tools (ChatGPT browse, Perplexity, Claude with web) dereference it on demand.
- [ ] 18. /llms.txt exists at the domain root and lists the launch page in a top-level section with a one-line description.
- [ ] 19. /llms-full.txt (optional) includes the full markdown body of the launch page for engines that prefer pre-extracted text.
- [ ] 20. A .md mirror of the launch page is reachable at the launch URL with .md appended, per the llms.txt proposal.
- [ ] 21. robots.txt does not block GPTBot, PerplexityBot, ClaudeBot, Google-Extended, or OAI-SearchBot unless that is a deliberate policy.
- [ ] 22. The robots meta tag on the launch page does not include noindex, noai, or noimageai.
- [ ] 23. XML sitemap updated with the launch URL and a fresh <lastmod> matching launch day. Segment by content type if the site exceeds 20,000 URLs.
- [ ] 24. CDN and WAF rules whitelist AI crawler user-agents so they do not get 403'd at the edge.
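A sketch of what signals 18 and 20 might look like together in a launch-day /llms.txt, following the H1 / blockquote / H2-section shape of the llmstxt.org proposal — the brand, paths, and descriptions are invented placeholders:

```markdown
# Acme Widgets Inc.

> Acme Widgets Cloud is a hosted widget platform for engineering teams.

## Launch

- [Acme Widgets Cloud launch](https://www.example.com/launch/widgets-cloud.md): Announcement, pricing, and availability for Acme Widgets Cloud.

## Press

- [Newsroom](https://www.example.com/press.md): Press releases and media kit downloads.
```

Note that each link points at the .md mirror (signal 20), and the blockquote reuses the canonical sentence from signal 8.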
Pillar 4 — Pressroom and freshness (signals 25-30)
AI engines bias toward sources that look like newsrooms: dated, authored, with an obvious provenance trail.
- [ ] 25. Press release published on the same domain (or a subdomain owned by the brand) and linked from the launch page. Avoid third-party-only PR distribution.
- [ ] 26. /press or /newsroom index lists the launch with date, headline, and a one-paragraph summary in plain HTML — no JS-only render.
- [ ] 27. Media kit with logos, screenshots, and boilerplate is downloadable without a form gate; AI crawlers cannot fill forms.
- [ ] 28. Author byline and datePublished are visible on the launch page, not hidden behind a CMS toggle.
- [ ] 29. Three or more independent sources (analyst note, partner blog, journalist coverage) confirm the launch within 24 hours, ideally citing the canonical sentence from signal 8.
- [ ] 30. Internal cross-links from the homepage, product hub, and changelog point to the launch page with descriptive anchor text — the strongest internal entity signal AI engines pick up.
Roles and ownership
| Pillar | Primary owner | Reviewer |
|---|---|---|
| Entity coverage | Brand / Product Marketing | Founder or CEO |
| Schema and structured data | Web engineering | SEO lead |
| llms.txt and crawler controls | Developer relations or platform | SEO lead |
| Pressroom and freshness | Communications | Product Marketing |
Assign one DRI per pillar. Each DRI signs off in the launch tracker before the page goes public.
Common failure modes at launch
- JSON-LD copied from a template with placeholder name or brand values. Always validate with the launch-day URL.
- llms.txt listing a staging path because the file was generated before the production URL was finalized.
- Press kit hidden behind a HubSpot or Marketo form that AI crawlers cannot fill, so logos and boilerplate never enter the citation graph.
- dateModified left at a CMS default months in the past, causing AI engines to treat the page as stale.
- Canonical sentence rewritten by a copyeditor at the last minute, breaking signal 8 consistency across the site, press release, and media kit.
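The first and fourth failure modes above are mechanically checkable before deploy. A minimal sketch of such a pre-launch script, assuming the page's JSON-LD sits in standard application/ld+json script tags — the placeholder list and sample values are assumptions, not a standard:

```python
import json
import re

# Hypothetical placeholder values that signal a template was never filled in.
PLACEHOLDERS = {"ACME", "Your Product", "TODO", "example.com"}

def jsonld_blocks(html: str) -> list[dict]:
    """Extract every <script type="application/ld+json"> payload from raw HTML."""
    pattern = r'<script[^>]*application/ld\+json[^>]*>(.*?)</script>'
    return [json.loads(m) for m in re.findall(pattern, html, re.DOTALL)]

def launch_day_issues(html: str, launch_date: str) -> list[str]:
    """Flag placeholder names and a WebPage dateModified that is not launch day."""
    issues = []
    for block in jsonld_blocks(html):
        name = str(block.get("name", ""))
        if not name or name in PLACEHOLDERS:
            issues.append(f"placeholder or missing name: {name!r}")
        if block.get("@type") == "WebPage" and block.get("dateModified") != launch_date:
            issues.append(f"stale dateModified: {block.get('dateModified')}")
    return issues
```

Running this against the staging build on launch morning catches the stale-CMS-default problem (signal 13) before AI crawlers see the page.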
FAQ
Q: How long before launch should I run this checklist?
Start 72 hours before launch and aim for all 30 signals green 24 hours before. Schema and llms.txt edits often need a deploy and a CDN purge, so leaving them to launch day risks a stale cache when AI crawlers first hit the URL.
Q: Is llms.txt actually read by AI engines today?
Adoption by autonomous LLM crawlers (GPTBot, ClaudeBot, PerplexityBot) remains uneven as of 2026. However, on-demand AI tools — ChatGPT browse, Perplexity, Claude with web access, and Copilot — frequently dereference /llms.txt and the .md mirror when a user asks about a specific URL or brand. Shipping it costs little and pays off whenever a user prompts an AI engine about your launch.
Q: Do I need both JSON-LD Product and Organization schemas?
Yes. Product describes the launch artifact; Organization resolves the brand entity sitewide. Without Organization, AI engines may extract product facts but fail to attribute them to your company, and your brand will be missing from the citation.
Q: What is the single highest-impact signal if I can only fix one?
Signal 8 — the one-sentence canonical description used verbatim on the launch page, homepage, and press kit. It is the sentence AI engines are most likely to quote, and consistency across surfaces is what tells them it is canonical.
Q: Should I block AI crawlers if I am worried about training?
That is a separate decision from launch readiness. If you block GPTBot or ClaudeBot in robots.txt, you should expect reduced citation in those engines. The pre-launch checklist assumes you want maximum citation; adjust signals 21 and 22 according to your AI-content policy.
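Whichever policy you choose, signal 21 is easy to verify mechanically. A minimal sketch using the standard-library robots.txt parser — the crawler list mirrors signal 21, and the example URL is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# The AI crawler user-agents named in signal 21.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "OAI-SearchBot"]

def blocked_ai_crawlers(robots_txt: str, url: str) -> list[str]:
    """Return the AI crawler user-agents that this robots.txt disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]
```

If the returned list is non-empty and that was not a deliberate policy choice, fix robots.txt before go-live; if it was deliberate, record the expected citation loss in the launch tracker.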
Sources verified
- llms.txt proposal — https://llmstxt.org/ — verified 2026-04-29
- schema.org Product type — https://schema.org/Product — verified 2026-04-29
- Salesforce GEO best practices — https://www.salesforce.com/blog/generative-engine-optimization/ — verified 2026-04-29
- Search Engine Land schema-for-AI analysis — https://searchengineland.com/schema-markup-ai-search-no-hype-472339 — verified 2026-04-29
- Directive Consulting GEO best-practices checklist (llms.txt + sitemap segmentation) — https://directiveconsulting.com/blog/a-guide-to-generative-engine-optimization-geo-best-practices/ — verified 2026-04-29
- Longato AEM 1,000-domain llms.txt audit (adoption data) — https://www.longato.ch/llms-recommendation-2025-august/ — verified 2026-04-29
Related Articles
AI Citation Crisis Response Checklist: 20 Steps When ChatGPT or AI Overviews Stop Citing Your Brand
20-step crisis response checklist for diagnosing and reversing sudden AI citation drops in ChatGPT, Perplexity, and AI Overviews within 30 days.
AI Citation Forecasting Framework: Modeling Citation Lift Before You Publish
AI citation forecasting framework predicts how new content will lift LLM citations using entity coverage, intent fit, and competitor source overlap.
AI citation forecasting: how to estimate which pages will get cited
A scoring framework to forecast which pages AI search engines will cite, based on intent fit, authority, evidence density, and structure quality.