Lighthouse for GEO: Performance and Quality Auditing
Google Lighthouse audits performance, accessibility, SEO, and best practices using lab data. For GEO, focus on the SEO and accessibility categories plus the 2026 Core Web Vitals (LCP, INP, CLS), and run Lighthouse in CI rather than treating any single score as a ranking guarantee.
TL;DR: Lighthouse is a lab-quality auditor, not a citation-ranking signal. Use it to enforce structural fundamentals — valid HTML, accessible markup, fast loads, complete metadata — because those fundamentals are what AI extractors actually rely on. Track the 2026 Core Web Vitals (LCP, INP, CLS) as field metrics in CrUX, not just lab scores in Lighthouse, and wire both into CI.
What Lighthouse Actually Measures
Lighthouse is an open-source auditor maintained by the Chrome team. It runs against any URL and produces a 0-100 score per category plus a list of specific audits. It is available in Chrome DevTools, the lighthouse CLI, the Node module, PageSpeed Insights, and Lighthouse CI for build pipelines.
The categories are:
| Category | What it measures | GEO relevance |
|---|---|---|
| Performance | Synthetic load behavior across FCP, LCP, CLS, TBT (the lab proxy for INP), and Speed Index | Crawl efficiency and template robustness |
| Accessibility | axe-core rules: alt text, ARIA, contrast, semantic landmarks | Direct correlate of content extractability |
| SEO | Meta tags, heading hierarchy, crawlability, link text quality | Strongest GEO signal in Lighthouse |
| Best Practices | HTTPS, console errors, deprecated APIs | Trust signals for crawlers |
| PWA (legacy) | Manifest, service worker, installability; the category was removed in Lighthouse 12 | Low GEO relevance |
Two important caveats before you treat any score as a ranking promise:
- Google does not use Lighthouse scores directly for search ranking. It uses CrUX field data, which is real-user measurement, not synthetic.
- No major AI provider has publicly tied Lighthouse scores to AI citation behavior. What AI extractors clearly do care about is clean HTML, fast server responses, and accurate metadata — the underlying signals Lighthouse surfaces, not the score itself.
This is why a Lighthouse audit is a quality gate, not a leaderboard. Use it to catch regressions, not to chase 100s.
The 2026 Core Web Vitals
Core Web Vitals are the field-measured user-experience metrics that feed Google's ranking-relevant page experience signal. INP replaced FID on 12 March 2024, so the trio is now:
| Metric | What it measures | Good threshold |
|---|---|---|
| LCP (Largest Contentful Paint) | Time until the largest visible element renders | ≤ 2.5 s |
| INP (Interaction to Next Paint) | Responsiveness of the slowest user interaction | ≤ 200 ms |
| CLS (Cumulative Layout Shift) | Visual stability over the page lifecycle | ≤ 0.1 |
Lighthouse estimates these in lab conditions; CrUX reports the 75th-percentile field value over a 28-day window. For GEO, treat lab numbers as a regression check and field numbers as the truth.
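The field-side check can be sketched in a few lines. This is a minimal illustration, not CrUX's own implementation: `p75` uses a simple nearest-rank percentile, the sample values in the usage note are made up, and the "poor" boundaries are Google's published limits rather than figures from this article.

```python
from math import ceil

# (good_limit, poor_limit) per metric. The "good" limits match the table above;
# the "poor" limits are Google's published boundaries and are an assumption
# in the sense that this article does not state them.
THRESHOLDS = {
    "LCP": (2500.0, 4000.0),  # milliseconds
    "INP": (200.0, 500.0),    # milliseconds
    "CLS": (0.1, 0.25),       # unitless
}


def p75(samples):
    """Nearest-rank 75th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric, value):
    """Classify a metric value as good, needs improvement, or poor."""
    good_limit, poor_limit = THRESHOLDS[metric]
    if value <= good_limit:
        return "good"
    if value <= poor_limit:
        return "needs improvement"
    return "poor"
```

For example, `rate("LCP", p75([1000, 2000, 3000, 4000]))` classifies a hypothetical set of four LCP samples by their 75th-percentile value, which is how the field threshold is meant to be read.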
GEO-Relevant Audits, Category by Category
SEO category (highest GEO leverage)
The Lighthouse SEO audit is the closest thing to a structural readiness check for AI extractors. The audits worth treating as hard pass/fail in CI:
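- meta-description: must be present; Lighthouse only checks presence, so treat 120-160 characters as an editorial target
- document-title: must be present; uniqueness per page needs a separate crawl-level check
- heading-order: keep a single H1 and descend through H2/H3/H4 without skipping levels; Lighthouse reports this audit under Accessibility, and the audit itself only checks the order
- link-text: no "read more" or "click here" anchors
- crawlable-anchors: anchors are real elements with an href, not JS handlers
- is-crawlable: not blocked by robots directives
- hreflang and canonical: valid alternates and canonicals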
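One way to treat the audits listed above as hard pass/fail is to read the Lighthouse JSON report and look up each audit ID. The sketch below assumes the report has already been parsed into a dictionary; the `audits` object keyed by audit ID, with a `score` between 0 and 1 (or null), is part of the report format, while the function name and the decision to treat a null score as a pass are illustrative choices.

```python
# Audit IDs taken from the list above.
REQUIRED_AUDITS = [
    "meta-description",
    "document-title",
    "heading-order",
    "link-text",
    "crawlable-anchors",
    "is-crawlable",
    "hreflang",
    "canonical",
]


def failing_audits(lhr, required=REQUIRED_AUDITS):
    """Return the required audit IDs that are missing or scored below 1.

    A score of None (not applicable or informative) is treated as a pass
    in this sketch; an audit that is absent from the report is a failure.
    """
    audits = lhr.get("audits", {})
    failures = []
    for audit_id in required:
        audit = audits.get(audit_id)
        if audit is None:
            failures.append(audit_id)
            continue
        score = audit.get("score")
        if score is not None and score < 1:
            failures.append(audit_id)
    return failures
```

In a pipeline, the dictionary would come from `json.load` on the report file, and a non-empty return value would fail the build.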
Accessibility category
Accessibility and AI extractability share most of their failure modes. Missing alt text, missing form labels, low-contrast text, and incorrect ARIA all degrade both human and machine reading. Aim for 95+ here — the audits are largely binary, so a 95 means a small, fixable list.
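As a rough illustration of one of these shared failure modes, the sketch below scans raw HTML for images that have no alt attribute, using only the Python standard library. It is far narrower than the axe-core rules Lighthouse runs, and the class and function names are hypothetical. An empty `alt=""` is accepted because it is the valid way to mark a decorative image.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collects the src of every <img> that lacks an alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Only a missing attribute is flagged; alt="" marks a decorative image.
        if "alt" not in attributes:
            self.missing_alt.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that have no alt attribute."""
    checker = AltTextChecker()
    checker.feed(html)
    return checker.missing_alt
```

Running this against server-rendered HTML, without executing JavaScript, approximates what a non-rendering extractor would see.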
Performance category
Slow servers and oversized payloads hurt every crawler, AI or otherwise. The audits that move the needle for AI retrieval specifically:
- server-response-time — keep TTFB under 600 ms; AI retrieval bots have aggressive timeouts
- render-blocking-resources — AI extractors often run with limited or no JavaScript
- unminified-javascript and unused-javascript — reduce the surface area of JS the crawler may not execute
- uses-text-compression — brotli or gzip
LCP breaks down into TTFB, Resource Load Delay, Resource Load Time, and Render Delay. When LCP regresses, look at the phase breakdown in the Lighthouse report rather than the headline number.
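To make the phase breakdown concrete, the sketch below takes the four phase durations in milliseconds and reports each phase's share of total LCP plus the dominant phase. The function names and the numbers in the usage note are hypothetical; the durations themselves would be read off the Lighthouse report.

```python
def lcp_phase_shares(ttfb_ms, load_delay_ms, load_time_ms, render_delay_ms):
    """Return each LCP phase's share of the total, as fractions summing to 1."""
    phases = {
        "TTFB": ttfb_ms,
        "Resource Load Delay": load_delay_ms,
        "Resource Load Time": load_time_ms,
        "Render Delay": render_delay_ms,
    }
    total = sum(phases.values())
    if total <= 0:
        raise ValueError("phase durations must sum to a positive value")
    return {name: duration / total for name, duration in phases.items()}


def dominant_phase(shares):
    """Return the name of the phase with the largest share."""
    return max(shares, key=shares.get)
```

For instance, with hypothetical durations of 800, 200, 600, and 400 ms, TTFB accounts for 40% of a 2,000 ms LCP, which points at the server rather than at image loading.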
Best Practices
HTTPS, no console errors, valid source maps, no deprecated APIs. Low effort, high baseline-trust impact.
Running a GEO Audit Manually
- Open Chrome DevTools (Cmd-Opt-I on macOS, F12 on Windows/Linux) on the target page.
- Switch to the Lighthouse tab.
- Select Performance, Accessibility, Best Practices, and SEO.
- Choose Mobile (most AI agent traffic and Google's primary index are mobile-equivalent).
- Use Navigation mode for cold-load metrics.
- Click Analyze page load.
- Repeat for each template (homepage, hub, leaf article), on at least three URLs each, and again under slow-network throttling.
For a quick CLI run:

    npx lighthouse https://example.com/article-slug \
      --only-categories=performance,accessibility,seo,best-practices \
      --form-factor=mobile \
      --output=html,json \
      --output-path=./lhr

Lighthouse in CI
Manual audits are useful for diagnosis; CI is what keeps GEO regressions from shipping. The widely used integration is treosh/lighthouse-ci-action for GitHub Actions, which wraps Lighthouse CI v0.15.x and Lighthouse v12.6.
A minimal workflow that asserts thresholds and fails the build on regressions:

    name: lighthouse
    on: [pull_request]
    jobs:
      lhci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: 20
          - name: Run Lighthouse CI
            uses: treosh/lighthouse-ci-action@v12
            with:
              urls: |
                https://staging.example.com/
                https://staging.example.com/geo/what-is-geo
              configPath: ./.lighthouserc.json
              uploadArtifacts: true

With assertions in .lighthouserc.json:

    {
      "ci": {
        "assert": {
          "assertions": {
            "categories:performance": ["warn", { "minScore": 0.8 }],
            "categories:accessibility": ["error", { "minScore": 0.95 }],
            "categories:seo": ["error", { "minScore": 0.95 }],
            "categories:best-practices": ["error", { "minScore": 0.9 }]
          }
        }
      }
    }

Keep performance as a warn, not an error: lab performance is noisy across runs; use CrUX or RUM to gate ship decisions on real-user data.
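The warn-versus-error split can be illustrated outside Lighthouse CI as well. The sketch below mirrors the thresholds from the configuration above and applies them to the `categories` section of a parsed Lighthouse JSON report. It approximates the behavior for illustration and is not how Lighthouse CI itself is implemented; `ASSERTIONS` and `evaluate` are hypothetical names.

```python
# Category ID -> (level, minimum score), mirroring the assertions above.
ASSERTIONS = {
    "performance": ("warn", 0.8),
    "accessibility": ("error", 0.95),
    "seo": ("error", 0.95),
    "best-practices": ("error", 0.9),
}


def evaluate(lhr, assertions=ASSERTIONS):
    """Return (errors, warnings) as lists of human-readable messages."""
    errors, warnings = [], []
    for category, (level, min_score) in assertions.items():
        score = lhr["categories"][category]["score"]
        # A missing (None) score is treated as a failed assertion.
        if score is None or score < min_score:
            message = f"{category}: {score} < {min_score}"
            if level == "error":
                errors.append(message)
            else:
                warnings.append(message)
    return errors, warnings
```

A build script would exit non-zero only when `errors` is non-empty and print `warnings` for review, which keeps noisy lab performance from blocking a merge.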
Common False Positives
- Third-party analytics killing performance. Tag managers, A/B test SDKs, and chat widgets routinely cost 20-30 performance points. Decide consciously whether the business value is worth it and document the trade-off.
- Lighthouse emulating a mid-tier phone with a throttled CPU and simulated slow 4G. It is meant to be a stress test. A 70 mobile / 95 desktop split is normal for content-heavy pages.
- Score variance run-to-run. ±5 points between runs is expected. Use median-of-five for any decision.
- Single-page apps reporting low SEO scores due to client-side routing. This is real, not a false positive: AI crawlers usually do not execute JavaScript, so SSR or static rendering is required for GEO.
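The median-of-five rule from the list above is simple to apply to a set of collected scores. The sketch below assumes the scores have already been gathered from five separate runs; the function name and the example values are hypothetical.

```python
from statistics import median


def median_of_runs(scores, minimum_runs=5):
    """Return the median score, refusing to decide on too few runs."""
    if len(scores) < minimum_runs:
        raise ValueError(f"need at least {minimum_runs} runs, got {len(scores)}")
    return median(scores)


def spread(scores):
    """Return the gap between the best and worst run."""
    return max(scores) - min(scores)
```

With hypothetical performance scores of 78, 83, 80, 85, and 79, the median is 80 and the spread is 7 points, so a single run could land either side of an 80-point threshold while the median gives a steadier reading.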
FAQ
Q: Do AI providers use Lighthouse scores to decide which sites to cite?
No provider has publicly confirmed that. What AI extractors do care about — clean HTML, fast TTFB, semantic structure, accurate metadata — is what Lighthouse audits surface, so a clean Lighthouse report correlates with extractability without being a direct ranking input.
Q: Should I optimize for the lab Lighthouse score or CrUX field data?
Use Lighthouse to catch regressions in CI and CrUX to validate real-user experience. Google's ranking-relevant signal is field data; lab data is the canary that tells you something changed before users feel it.
Q: Lighthouse SEO score is 100 but I'm not getting AI citations. Why?
Lighthouse SEO covers structural readiness, not topical authority. A 100 means crawlers can read your page; it does not mean AI systems pick you as the best source. Pair Lighthouse with topical depth, internal linking, and structured data work.
Q: What target scores should we set in CI?
Reasonable defaults: SEO ≥ 95, Accessibility ≥ 95, Best Practices ≥ 90, Performance warn at 80. Tune by content template, not site-wide — hub pages and article pages have different floors.
Q: Is INP the same as the old FID?
No. FID measured only the delay before the first input was processed. INP measures the full input-to-paint latency of every interaction and reports roughly the slowest one (outliers are discounted on pages with many interactions), which is a far better proxy for perceived responsiveness. INP replaced FID on 12 March 2024.