GEO for Developers: Technical Implementation Guide
GEO for developers is the implementation work that makes a site discoverable, parseable, and citable by AI search engines. The core deliverables are JSON-LD schema, an llms.txt file, semantic HTML, an AI-aware robots.txt, and automated validation in CI.
TL;DR: Ship five things to make your site GEO-ready: (1) a JSON-LD schema component on every content template, (2) an llms.txt file at the site root, (3) explicit allow rules in robots.txt for major AI crawlers, (4) a semantic HTML template with a single H1 and answer-first prose, and (5) CI checks that fail the build if structured data or heading hierarchy regresses. None of these replace SEO — they extend it.
Why GEO Is a Developer Problem
Most GEO advice is written for marketers, but the work that actually moves the needle lives in your build pipeline. AI assistants like ChatGPT, Claude, Perplexity, and Gemini retrieve content through crawlers and APIs, then synthesize answers from whatever they can parse cleanly. If your templates emit malformed schema, enforce the single-H1 rule inconsistently, or block AI user agents in robots.txt, your content can be effectively invisible to those systems regardless of how well it ranks in Google.
GEO inherits the entire SEO stack — crawlability, performance, canonicalization, sitemaps — and adds a thin layer of AI-specific machine-readable hooks on top. Treat the AI-specific work as code: tests, types, and CI gates, not one-off marketing tasks.
Developer Checklist
Infrastructure (one-time)
- [ ] Deploy llms.txt at the site root
- [ ] Deploy ai.txt (or equivalent policy file) at the site root
- [ ] Configure robots.txt with explicit rules for AI crawlers
- [ ] Generate sitemap.xml automatically from your content source
- [ ] Implement canonical URL logic in your routing layer
- [ ] Add a feed autodiscovery `<link rel="alternate">` tag if you publish a feed
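The canonical URL item in the list above can be sketched as a small normalization function. This is a minimal illustration, not a definitive implementation: the function name `canonicalUrl`, the `TRACKING_PARAMS` list, and the no-trailing-slash policy are all assumptions, and it presumes every input URL belongs to your own site.

```typescript
// Hypothetical list of tracking parameters to drop; adjust for your analytics setup.
const TRACKING_PARAMS = [
  "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
  "gclid", "fbclid",
];

// Returns one canonical form for any URL variant that points at the same page.
// Assumes rawUrl belongs to this site; relative paths are resolved against siteOrigin.
function canonicalUrl(rawUrl: string, siteOrigin: string): string {
  const origin = new URL(siteOrigin);
  const url = new URL(rawUrl, siteOrigin);

  // Force the canonical scheme, host, and port.
  url.protocol = origin.protocol;
  url.host = origin.host;
  url.port = origin.port;

  // Fragments never reach the server, so they are not part of the canonical URL.
  url.hash = "";

  // Drop tracking parameters and sort the rest so parameter order does not matter.
  for (const param of TRACKING_PARAMS) {
    url.searchParams.delete(param);
  }
  url.searchParams.sort();

  // Assumed policy: no trailing slash, except on the root path.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }

  return url.toString();
}

// Example: both variants collapse to the same string.
// canonicalUrl("http://EXAMPLE.com/blog/geo-guide/?utm_source=x&b=2&a=1#top", "https://example.com")
//   → "https://example.com/blog/geo-guide?a=1&b=2"
```

Calling this from the routing layer and emitting the result in `<link rel="canonical">` keeps the sitemap, the schema component, and the page markup pointing at one URL per page.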
Per-page (template-level)
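- [ ] JSON-LD structured data component injected into `<head>` or `<body>`
- [ ] Semantic HTML elements (for example `<article>`, `<section>`, `<header>`, `<nav>`)
- [ ] Single `<h1>` enforced at the template level
- [ ] `<title>` and `<meta name="description">` populated from frontmatter
- [ ] Open Graph and Twitter Card metadata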
- [ ] Last-Modified header derived from the content's update date
- [ ] Answer-first opening paragraph that defines the topic in one or two sentences
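For the Last-Modified item above, one possible approach is to convert the content's update date into the HTTP date format at render time. The sketch below assumes the date comes from a frontmatter field as an ISO 8601 string; the function name `lastModifiedHeader` is hypothetical.

```typescript
// Converts an ISO 8601 date string (assumed to come from frontmatter)
// into the HTTP date format expected by the Last-Modified header.
// Returns undefined for unparseable input so the caller can omit the header.
function lastModifiedHeader(updatedAt: string): string | undefined {
  const date = new Date(updatedAt);
  if (Number.isNaN(date.getTime())) {
    return undefined;
  }
  // toUTCString() produces the "Tue, 05 Mar 2024 00:00:00 GMT" form.
  return date.toUTCString();
}

// Example:
// lastModifiedHeader("2024-03-05") → "Tue, 05 Mar 2024 00:00:00 GMT"
```

Omitting the header on invalid input is a deliberate choice here: a missing Last-Modified is usually less harmful than a wrong one.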
Build pipeline
- [ ] Auto-generate sitemap.xml and llms.txt on every build
- [ ] Validate JSON-LD against schema.org in CI
- [ ] Lint heading hierarchy (no skipped levels, exactly one H1)
- [ ] Block deploys when structured data tests fail
- [ ] Track per-page word count and warn on thin content
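The word-count item above could be implemented as a small build step along these lines. This is a rough sketch under stated assumptions: the `Page` shape, the function names, and the 300-word default are hypothetical, and the ASCII-only word test will undercount non-Latin content.

```typescript
type Page = { path: string; body: string };
type PageStats = { path: string; wordCount: number };

// Hypothetical default threshold; pick a value that fits your content.
const MIN_WORDS = 300;

// Counts prose words in a Markdown body, ignoring fenced code and HTML tags.
function countWords(markdown: string): number {
  const text = markdown
    .replace(/```[\s\S]*?```/g, " ") // drop fenced code blocks
    .replace(/<[^>]+>/g, " ");       // drop inline HTML tags
  // Keep only tokens containing a letter or digit, so list bullets
  // and heading markers are not counted as words.
  return text.split(/\s+/).filter((token) => /[A-Za-z0-9]/.test(token)).length;
}

// Returns stats for every page below the threshold.
function findThinPages(pages: Page[], minWords: number = MIN_WORDS): PageStats[] {
  return pages
    .map((page) => ({ path: page.path, wordCount: countWords(page.body) }))
    .filter((stats) => stats.wordCount < minWords);
}
```

Because the checklist calls for a warning rather than a failure, a CI step would typically log the result of `findThinPages` and exit successfully.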
Implementation: Five Concrete Pieces
1. JSON-LD Schema Component
JSON-LD is the format Google recommends for structured data, and crawlers that feed AI systems can read the same markup. Keep it in a single component so the props are type-checked and every page emits identical markup.
tsx
type ArticleSchemaProps = {
  title: string
  datePublished: string
  dateModified: string
  authorName: string
  description: string
  imageUrl: string
  publisherName: string
  publisherLogoUrl: string
}

export function ArticleSchema(props: ArticleSchemaProps) {
  const schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: props.title,
    image: [props.imageUrl],
    datePublished: props.datePublished,
    dateModified: props.dateModified,
    author: { "@type": "Person", name: props.authorName },
    publisher: {
      "@type": "Organization",
      name: props.publisherName,
      logo: { "@type": "ImageObject", url: props.publisherLogoUrl },
    },
    description: props.description,
  }
  return (