Lighthouse for GEO: Performance and Quality Auditing

Google Lighthouse audits performance, accessibility, SEO, and best practices using lab data. For GEO, focus on the SEO and accessibility categories plus the 2026 Core Web Vitals (LCP, INP, CLS), and run Lighthouse in CI rather than treating any single score as a ranking guarantee.

TL;DR: Lighthouse is a lab-quality auditor, not a citation-ranking signal. Use it to enforce structural fundamentals — valid HTML, accessible markup, fast loads, complete metadata — because those fundamentals are what AI extractors actually rely on. Track the 2026 Core Web Vitals (LCP, INP, CLS) as field metrics in CrUX, not just lab scores in Lighthouse, and wire both into CI.

What Lighthouse Actually Measures

Lighthouse is an open-source auditor maintained by the Chrome team. It runs against any URL and produces a 0-100 score per category plus a list of specific audits. It is available in Chrome DevTools, the lighthouse CLI, the Node module, PageSpeed Insights, and Lighthouse CI for build pipelines.

The categories are:

Category	What it measures	GEO relevance
Performance	Synthetic load behavior across FCP, LCP, TBT (the lab proxy for INP), CLS, Speed Index	Crawl efficiency and template robustness
Accessibility	axe-core rules: alt text, ARIA, contrast, semantic landmarks	Direct correlate of content extractability
SEO	Meta tags, heading hierarchy, crawlability, link text quality	Strongest GEO signal in Lighthouse
Best Practices	HTTPS, console errors, deprecated APIs	Trust signals for crawlers
PWA (removed in Lighthouse 12)	Manifest, service worker, installability	Low GEO relevance

Two important caveats before you treat any score as a ranking promise:

Google does not use Lighthouse scores directly for search ranking. It uses CrUX field data, which is real-user measurement, not synthetic.
No major AI provider has publicly tied Lighthouse scores to AI citation behavior. What AI extractors clearly do care about is clean HTML, fast server responses, and accurate metadata — the underlying signals Lighthouse surfaces, not the score itself.

This is why a Lighthouse audit is a quality gate, not a leaderboard. Use it to catch regressions, not to chase 100s.

The 2026 Core Web Vitals

Core Web Vitals are the field-measured user-experience metrics that Google uses as its ranking-relevant signal. As of 12 March 2024, INP replaced FID and the trio is now:

Metric	What it measures	Good threshold
LCP (Largest Contentful Paint)	Time until the largest visible element renders	≤ 2.5 s
INP (Interaction to Next Paint)	Responsiveness of the slowest user interaction	≤ 200 ms
CLS (Cumulative Layout Shift)	Visual stability over the page lifecycle	≤ 0.1

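Lighthouse measures LCP and CLS in lab conditions and reports Total Blocking Time (TBT) as a lab proxy for INP, since INP needs real interactions; CrUX reports the 75th-percentile field value over a 28-day window. For GEO, treat lab numbers as a regression check and field numbers as the truth.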
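The 75th-percentile rule and the thresholds in the table can be sketched in a few lines of Python. This is only an illustration: the nearest-rank percentile is a simplification of how CrUX aggregates its data, the sample values are made up, and the "poor" boundaries (4.0 s, 500 ms, 0.25) come from Google's published Core Web Vitals guidance rather than from the table above.

```python
import math

# (good, poor) boundaries. LCP in seconds, INP in milliseconds, CLS unitless.
# The "good" values match the table; the "poor" values are Google's published ones.
THRESHOLDS = {
    "LCP": (2.5, 4.0),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def p75(samples):
    """Nearest-rank 75th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def rate(metric, value):
    """Classify a value as good, needs-improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs-improvement"
    return "poor"


# Hypothetical LCP samples in seconds, standing in for real-user data.
lcp_samples = [1.8, 2.1, 2.2, 2.4, 2.6, 2.9, 3.1, 4.4]
lcp_p75 = p75(lcp_samples)
print(lcp_p75, rate("LCP", lcp_p75))
```

In this made-up sample most page loads are under 2.5 s, yet the 75th percentile lands at 2.9 s, which is why a healthy-looking lab run can coexist with a failing field assessment.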

GEO-Relevant Audits, Category by Category

SEO category (highest GEO leverage)

The Lighthouse SEO audit is the closest thing to a structural readiness check for AI extractors. The audits worth treating as hard pass/fail in CI:

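meta-description: must be present and non-empty; Lighthouse does not check length, so treat 120-160 characters as an editorial target
document-title: must be present; Lighthouse does not check uniqueness, so enforce one unique title per page separately
heading-order: reported under the Accessibility category; heading levels descend through H2/H3/H4 without skipping, and a single H1 is a sensible house rule
link-text: no generic "read more" / "click here" anchors
crawlable-anchors: anchors are real <a href> elements, not JS click handlers
is-crawlable: not blocked by robots directives (robots.txt, meta robots, X-Robots-Tag)
hreflang and canonical: valid alternates and canonicals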
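Some of the targets in this list, such as the 120-160 character description length and the single-H1 rule, are stricter than a presence check, so it can help to verify them with a small supplementary script over the served HTML. The following is a minimal sketch using only Python's standard library; `check_page` and its parameters are hypothetical names chosen for illustration, not part of Lighthouse.

```python
from html.parser import HTMLParser


class HeadAndHeadingScan(HTMLParser):
    """Collects the meta description and the sequence of heading levels."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content") or ""
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def check_page(html, min_len=120, max_len=160):
    """Return a list of human-readable problems; an empty list means pass."""
    scan = HeadAndHeadingScan()
    scan.feed(html)
    problems = []
    if scan.description is None:
        problems.append("meta description missing")
    elif not min_len <= len(scan.description) <= max_len:
        problems.append(f"meta description is {len(scan.description)} characters")
    if scan.heading_levels.count(1) != 1:
        problems.append("expected exactly one h1")
    # A heading may go deeper by at most one level at a time.
    for prev, cur in zip(scan.heading_levels, scan.heading_levels[1:]):
        if cur > prev + 1:
            problems.append(f"heading jumps from h{prev} to h{cur}")
    return problems


# Hypothetical page with a too-short description and a skipped heading level.
sample = (
    '<html><head><meta name="description" content="short"></head>'
    "<body><h1>Title</h1><h3>Section</h3></body></html>"
)
print(check_page(sample))
```

Because the script reads raw HTML without executing JavaScript, it also approximates what a non-rendering crawler would see.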

Accessibility category

Accessibility and AI extractability share most of their failure modes. Missing alt text, missing form labels, low-contrast text, and incorrect ARIA all degrade both human and machine reading. Aim for 95+ here — the audits are largely binary, so a 95 means a small, fixable list.

Performance category

Slow servers and oversized payloads hurt every crawler, AI or otherwise. The audits that move the needle for AI retrieval specifically:

server-response-time — keep TTFB under 600 ms; AI retrieval bots have aggressive timeouts
render-blocking-resources — AI extractors often run with limited or no JavaScript
unminified-javascript and unused-javascript — reduce the surface area of JS the crawler may not execute
uses-text-compression — brotli or gzip

LCP breaks down into TTFB, Resource Load Delay, Resource Load Time, and Render Delay. When LCP regresses, look at the phase breakdown in the Lighthouse report rather than the headline number.
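As a rough illustration of reading the phase breakdown, the sketch below takes the four phase durations and reports which one dominates. The timings are hypothetical; in practice they would be copied from the LCP breakdown in the report.

```python
def dominant_lcp_phase(phases_ms):
    """Return (phase name, share of total LCP) for the largest phase."""
    total = sum(phases_ms.values())
    name = max(phases_ms, key=phases_ms.get)
    return name, phases_ms[name] / total


# Hypothetical timings in milliseconds; real values come from the report.
phases = {
    "TTFB": 900,
    "Resource Load Delay": 300,
    "Resource Load Time": 600,
    "Render Delay": 200,
}
name, share = dominant_lcp_phase(phases)
print(f"LCP {sum(phases.values())} ms, dominated by {name} ({share:.0%})")
```

Here the headline LCP of 2000 ms looks acceptable, but nearly half of it is server response time, which points at the backend or CDN rather than at images or render-blocking assets.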

Best Practices

HTTPS, no console errors, valid source maps, no deprecated APIs. Low effort, high baseline-trust impact.

Running a GEO Audit Manually

Open Chrome DevTools (Cmd-Opt-I on macOS, F12 on Windows/Linux) on the target page.
Switch to the Lighthouse tab.
Select Performance, Accessibility, Best Practices, and SEO.
Choose Mobile (most AI agent traffic and Google's primary index are mobile-equivalent).
Use Navigation mode for cold-load metrics.
Click Analyze page load.
Repeat on at least three URLs per template (homepage, hub, leaf article), and again with slow-network throttling.

For a quick CLI run:

npx lighthouse https://example.com/article-slug \
  --only-categories=performance,accessibility,seo,best-practices \
  --form-factor=mobile \
  --output=html,json \
  --output-path=./lhr
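The JSON report from a run like this can be summarized with a short script. The sketch below assumes the report follows the usual Lighthouse result shape, with `categories[id].score` on a 0-1 scale and per-audit `score` and `scoreDisplayMode` fields; the sample data is trimmed and hypothetical, and the report file name passed to `load_report` is an assumption to verify against what your run actually writes.

```python
import json


def summarize(lhr):
    """Return (category scores on a 0-100 scale, ids of failing binary audits)."""
    scores = {
        cat_id: None if cat["score"] is None else round(cat["score"] * 100)
        for cat_id, cat in lhr["categories"].items()
    }
    failing = sorted(
        audit_id
        for audit_id, audit in lhr["audits"].items()
        if audit.get("scoreDisplayMode") == "binary" and audit.get("score") == 0
    )
    return scores, failing


def load_report(path):
    """Load a Lighthouse JSON report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Trimmed, hypothetical report used only to show the shape of the data.
sample = {
    "categories": {"seo": {"score": 0.92}, "accessibility": {"score": 1}},
    "audits": {
        "meta-description": {"score": 0, "scoreDisplayMode": "binary"},
        "document-title": {"score": 1, "scoreDisplayMode": "binary"},
        "hreflang": {"score": None, "scoreDisplayMode": "notApplicable"},
    },
}
print(summarize(sample))
```

For a real run, something like `summarize(load_report("./lhr.report.json"))` would apply, assuming that is the file name Lighthouse produced. Listing failing audit ids alongside the scores keeps attention on the specific fix rather than the headline number.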

Lighthouse in CI

Manual audits are useful for diagnosis; CI is what keeps GEO regressions from shipping. The widely used integration is treosh/lighthouse-ci-action for GitHub Actions, which wraps Lighthouse CI v0.15.x and Lighthouse v12.6.

A minimal workflow that asserts thresholds and fails the build on regressions:

name: lighthouse
on: [pull_request]
jobs:
  lhci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
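      - name: Run Lighthouse CI
        uses: treosh/lighthouse-ci-action@v12
        with:
          # One representative URL per template, audited on every pull request.
          urls: |
            https://staging.example.com/
            https://staging.example.com/geo/what-is-geo
          # Assertion thresholds live in this file.
          configPath: ./.lighthouserc.json
          # Keep the generated reports as workflow artifacts for debugging.
          uploadArtifacts: true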

With assertions in .lighthouserc.json:

{
  "ci": {
    "assert": {
      "assertions": {
        "categories:performance": ["warn", { "minScore": 0.8 }],
        "categories:accessibility": ["error", { "minScore": 0.95 }],
        "categories:seo": ["error", { "minScore": 0.95 }],
        "categories:best-practices": ["error", { "minScore": 0.9 }]
      }
    }
  }
}

Keep performance as a warn, not an error — lab performance is noisy across runs; use CrUX or RUM to gate ship decisions on real-user data.
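To make the warn-versus-error distinction concrete, here is a small Python sketch that mirrors the thresholds from the assertions above and shows which categories would fail a build. It is an illustration of the gating logic only, not how Lighthouse CI is implemented, and the scores are hypothetical.

```python
# Mirrors the thresholds in the assertions above: (level, minimum score).
ASSERTIONS = {
    "performance": ("warn", 0.80),
    "accessibility": ("error", 0.95),
    "seo": ("error", 0.95),
    "best-practices": ("error", 0.90),
}


def evaluate(scores):
    """Split categories below their floor into errors and warnings."""
    errors, warnings = [], []
    for category, (level, min_score) in ASSERTIONS.items():
        if scores[category] < min_score:
            (errors if level == "error" else warnings).append(category)
    return errors, warnings


# Hypothetical scores on Lighthouse's 0-1 scale.
errors, warnings = evaluate(
    {"performance": 0.72, "accessibility": 0.97, "seo": 0.93, "best-practices": 0.95}
)
exit_code = 1 if errors else 0  # only "error" assertions fail the build
print(errors, warnings, exit_code)
```

With these numbers the low performance score only produces a warning, while the SEO score of 0.93 is what blocks the merge.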

Common False Positives

Third-party analytics killing performance. Tag managers, A/B test SDKs, and chat widgets routinely cost 20-30 performance points. Decide consciously whether the business value is worth it and document the trade-off.
Lighthouse emulating a slow, mid-tier phone on throttled 4G. It is meant to be a stress test. A 70 mobile / 95 desktop split is normal for content-heavy pages.
Score variance run-to-run. ±5 points between runs is expected. Use median-of-five for any decision.
Single-page apps reporting low SEO scores due to client-side routing. This is real, not a false positive: AI crawlers usually do not execute JavaScript, so SSR or static rendering is required for GEO.
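The median-of-five advice for run-to-run variance can be sketched as follows. The five scores and the floor of 70 are hypothetical, chosen so that individual runs disagree about whether the page passes.

```python
from statistics import median


def passes(run_scores, floor):
    """Decide on the median of several runs rather than on any single run."""
    return median(run_scores) >= floor


# Hypothetical performance scores from five runs of the same URL.
runs = [68, 74, 71, 66, 73]
print(median(runs), max(runs) - min(runs), passes(runs, 70))
```

Two of the five runs fall below 70, but the median is 71, so the page passes. Deciding on any single run would give an answer that depends on which run happened to be picked.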

FAQ

Q: Do AI providers use Lighthouse scores to decide which sites to cite?

No provider has publicly confirmed that. What AI extractors do care about — clean HTML, fast TTFB, semantic structure, accurate metadata — is what Lighthouse audits surface, so a clean Lighthouse report correlates with extractability without being a direct ranking input.

Q: Should I optimize for the lab Lighthouse score or CrUX field data?

Use Lighthouse to catch regressions in CI and CrUX to validate real-user experience. Google's ranking-relevant signal is field data; lab data is the canary that tells you something changed before users feel it.

Q: Lighthouse SEO score is 100 but I'm not getting AI citations. Why?

Lighthouse SEO covers structural readiness, not topical authority. A 100 means crawlers can read your page; it does not mean AI systems pick you as the best source. Pair Lighthouse with topical depth, internal linking, and structured data work.

Q: What target scores should we set in CI?

Reasonable defaults: SEO ≥ 95, Accessibility ≥ 95, Best Practices ≥ 90, Performance warn at 80. Tune by content template, not site-wide — hub pages and article pages have different floors.

Q: Is INP the same as the old FID?

No. FID measured only the delay before the first input was processed. INP captures the full interaction-to-paint cost across the worst-of-all interactions on the page, which is a far better proxy for perceived responsiveness. INP replaced FID on 12 March 2024.