Geodocs.dev

Lazy-Loading Impact on AI Crawlers: What Gets Indexed vs Skipped

Lazy-loading defers content until a user (or simulated viewport) reaches it. Most AI search crawlers do not execute JavaScript, do not scroll, and do not simulate a viewport, so JavaScript-driven lazy-loading effectively hides that content from GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot. Native HTML loading="lazy" is safe; client-side patterns require server-side rendering or HTML fallbacks.

TL;DR

Use native loading="lazy" for images and iframes — it is HTML-attribute based and visible to every AI crawler. Avoid JavaScript-only lazy-loading patterns (IntersectionObserver swapping data-src to src) on critical content unless you also server-side render the final HTML. Gemini and Google AI Overviews can render JS via Googlebot infrastructure; ChatGPT, Claude, and Perplexity crawlers cannot.

Per-crawler reference

| Crawler | Renders JavaScript | Sees native loading="lazy" | Sees JS lazy-load (IntersectionObserver) | Notes |
|---|---|---|---|---|
| GPTBot (training) | No | Yes (HTML attribute) | No | Fetches raw HTML only (Vercel, 2025) |
| OAI-SearchBot | No | Yes | No | Same engine class as GPTBot |
| ChatGPT-User | No | Yes | No | User-initiated fetch, still no JS execution |
| ClaudeBot | No | Yes | No | Anthropic crawler, no JS rendering (ClickRank, 2026) |
| PerplexityBot | No | Yes | No | Perplexity confirmed no JS execution |
| Googlebot | Yes (deferred) | Yes | Yes (after rendering) | Two-pass index with full JS (Google, 2025) |
| GoogleOther | Yes | Yes | Yes | Shares Googlebot infra |
| Google-Extended | Yes | Yes | Yes | Same rendering pipeline |
| Gemini (via Google infra) | Yes | Yes | Yes | Inherits Googlebot rendering |
| Bingbot | Limited | Yes | Partial | Some JS execution but less reliable than Googlebot |
| Bytespider (ByteDance) | No | Yes | No | Often ignores robots.txt |
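
When reading server logs, it can help to tag each request with the rendering class from the table above. The following is a minimal sketch, not an official mapping: the dictionary simply transcribes the table's "Renders JavaScript" column, the function name classify_user_agent is hypothetical, and matching on a substring of the user-agent header is an assumption that ignores spoofed agents.

```python
# Rendering capability per crawler token, transcribed from the table above.
# True = renders JavaScript, False = raw HTML only, None = limited/partial.
JS_RENDERING = {
    "GPTBot": False,
    "OAI-SearchBot": False,
    "ChatGPT-User": False,
    "ClaudeBot": False,
    "PerplexityBot": False,
    "Googlebot": True,
    "GoogleOther": True,
    "bingbot": None,
    "Bytespider": False,
}


def classify_user_agent(user_agent: str) -> str:
    """Return 'renders-js', 'html-only', 'partial', or 'unknown'."""
    ua = user_agent.lower()
    for token, renders in JS_RENDERING.items():
        if token.lower() in ua:
            if renders is None:
                return "partial"
            return "renders-js" if renders else "html-only"
    return "unknown"
```

Requests classified as html-only are the ones for which JS-driven lazy-loading matters most; a substring match says nothing about whether the request truly came from that crawler.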

Patterns and AI-crawler behavior

Native loading="lazy" (safe)

<img src="/hero.jpg" loading="lazy" alt="Annotated diagram of an AI search funnel" width="1200" height="600">

The src and alt attributes are present in the initial HTML response. Every crawler — JS-rendering or not — sees them (web.dev, 2025). Always include explicit width and height to avoid CLS.

IntersectionObserver swap (unsafe without SSR)

<img data-src="/hero.jpg" alt="..." class="lazy">
<script>/* swaps data-src → src on viewport entry */</script>

GPTBot, ClaudeBot, and PerplexityBot see only the data-src attribute. They do not extract the image. They also miss any text rendered into the DOM by client-side JS frameworks (React, Vue, Svelte) without server-side rendering (Passionfruit, 2026).
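
To find images affected by this pattern, one option is to scan the raw HTML for img tags that carry data-src but no src. This is a sketch using only Python's standard html.parser; the class and function names are hypothetical, and it assumes the lazy library uses the data-src attribute name shown above.

```python
from html.parser import HTMLParser


class LazyImageFinder(HTMLParser):
    """Collect <img> tags that have data-src but no usable src."""

    def __init__(self):
        super().__init__()
        self.js_only = []  # data-src values a non-rendering crawler never resolves

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # A missing or empty src means the image only appears after the JS swap.
        if attr.get("data-src") and not attr.get("src"):
            self.js_only.append(attr["data-src"])


def find_js_only_images(html: str) -> list[str]:
    finder = LazyImageFinder()
    finder.feed(html)
    return finder.js_only
```

Libraries that ship a placeholder src (for example a tiny inline GIF) alongside data-src would slip past this check, so treat an empty result as a hint rather than proof.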

Background images via CSS

CSS-only background-image: url(...) is parsed by JS-rendering crawlers but ignored by those that only consume HTML. Use <img> tags with loading="lazy" for any image whose alt text is meaningful for retrieval.

Below-the-fold text in tabs and accordions

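If the markup is in the initial HTML (collapsed via CSS), AI crawlers see it. If it is fetched via XHR on click, non-rendering crawlers never see it, and even JS-rendering crawlers (Googlebot, and partially Bingbot) see it only if their renderer triggers that request, which is unlikely because Googlebot does not click page elements while rendering.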
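
A quick way to test a specific tab or accordion is to check whether one of its sentences appears in the text of the raw HTML response. The sketch below uses the standard html.parser and skips script and style contents so that a string sitting inside JavaScript does not count; the names TextExtractor and phrase_in_initial_html are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gather text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def phrase_in_initial_html(html: str, phrase: str) -> bool:
    """True if the phrase occurs in the visible-text nodes of the raw HTML."""
    extractor = TextExtractor()
    extractor.feed(html)
    # Collapse whitespace so line breaks in the markup do not break the match.
    text = " ".join(" ".join(extractor.chunks).split())
    return " ".join(phrase.split()).lower() in text.lower()
```

Content hidden with CSS still passes this check, which matches what a non-rendering crawler receives; content that only exists inside a script or behind a fetch does not.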

Audit methodology

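Fetch the raw HTML the way a non-rendering AI crawler does, then compare it with the DOM a browser produces after running JavaScript. curl never executes JavaScript, so a second curl with a browser user-agent would only expose user-agent-based serving differences, not JS-loaded content. Headless Chrome's --dump-dom flag prints the rendered DOM instead (the binary may be named google-chrome, chromium, or chrome depending on the platform):

curl -sL -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://example.com/article -o ai.html
google-chrome --headless --dump-dom https://example.com/article > human.html
diff <(grep -oE '[[:space:]]src="[^"]+"' ai.html | sort) \
     <(grep -oE '[[:space:]]src="[^"]+"' human.html | sort)

The leading [[:space:]] keeps data-src="..." from being counted as src="...". If the diff shows images present in human.html but missing in ai.html, those images are JS-loaded and invisible to non-rendering crawlers. Repeat with ClaudeBot, PerplexityBot, and Applebot user-agent strings; first check whether your server or CDN applies bot-detection or cloaking rules that would treat a spoofed user agent differently from the real crawler.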
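
The same comparison can be done with an HTML parser rather than grep, which avoids regex edge cases such as data-src matching as src. A minimal sketch, assuming the contents of ai.html and human.html have already been read into strings; SrcCollector, collect_sources, and missing_for_crawler are hypothetical names.

```python
from html.parser import HTMLParser


class SrcCollector(HTMLParser):
    """Collect the src attribute of every element that has one."""

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        # dict(attrs) keys are exact attribute names, so data-src is not src.
        src = dict(attrs).get("src")
        if src:
            self.sources.add(src)


def collect_sources(html: str) -> set[str]:
    collector = SrcCollector()
    collector.feed(html)
    return collector.sources


def missing_for_crawler(raw_html: str, rendered_html: str) -> list[str]:
    """URLs present in the rendered HTML but absent from the raw response."""
    return sorted(collect_sources(rendered_html) - collect_sources(raw_html))
```

An empty list suggests every src reference a browser ends up with is already in the crawler's copy; any URL returned is a candidate for server-side rendering or an HTML fallback.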

Audit checklist

  • [ ] Critical images use loading="lazy" HTML attribute, not data-src swaps.
  • [ ] All <img> tags carry alt, width, and height.
  • [ ] Article body text is in the initial HTML response (server-side rendered if framework-based).
  • [ ] Structured data (JSON-LD) is in initial HTML, not injected by client-side JS.
  • [ ] Tabbed and accordion content is present in DOM at load, hidden via CSS.
  • [ ] Diff between AI-bot user-agent fetch and human fetch shows minimal content gap.
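
Two of the checklist items (image attributes and JSON-LD in the initial HTML) lend themselves to an automated check against the raw response. The sketch below is one possible approach using the standard html.parser; the names ChecklistAuditor and audit_initial_html and the shape of the returned dictionary are assumptions made for illustration.

```python
from html.parser import HTMLParser


class ChecklistAuditor(HTMLParser):
    """Flag <img> tags missing alt/width/height and detect JSON-LD scripts."""

    def __init__(self):
        super().__init__()
        self.incomplete_images = []  # (image URL, list of missing attributes)
        self.has_json_ld = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "img":
            missing = [name for name in ("alt", "width", "height") if name not in attr]
            if missing:
                url = attr.get("src") or attr.get("data-src")
                self.incomplete_images.append((url, missing))
        elif tag == "script" and (attr.get("type") or "").lower() == "application/ld+json":
            self.has_json_ld = True


def audit_initial_html(html: str) -> dict:
    auditor = ChecklistAuditor()
    auditor.feed(html)
    return {
        "incomplete_images": auditor.incomplete_images,
        "has_json_ld": auditor.has_json_ld,
    }
```

An empty alt attribute counts as present here, since decorative images legitimately use alt=""; the check only reports attributes that are absent altogether.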

Common mistakes

  • Replacing <img src> with <img data-src> and a JS lazy library on hero or in-body images.
  • Hydrating React/Vue/Svelte client-side without an SSR or static export step.
  • Loading FAQ JSON-LD via fetch() after page load.
  • Using display: none on critical sections — some crawlers discount hidden content.
  • Trusting that "Googlebot sees it, so AI bots will too" — they will not.

FAQ

Q: Do GPTBot and ClaudeBot run JavaScript?

No. Independent log-file analysis confirms ChatGPT and Claude crawlers fetch JavaScript files but do not execute them, so any content rendered client-side is invisible (Vercel, 2025).

Q: Is native HTML loading="lazy" safe for AI crawlers?

Yes. The attribute is parsed at HTML level and the src and alt are present in the initial response. All crawlers see it.

Q: What about Gemini and Google AI Overviews?

They rely on Googlebot's rendering pipeline, which executes JavaScript in a deferred second pass. JS lazy-loading is mostly handled, but only if the DOM eventually settles and content enters the viewport simulation Googlebot performs.

Q: How do I make a React or Next.js site visible to AI crawlers?

Use server-side rendering (getServerSideProps, RSC streaming with proper HTML emission) or static export. The initial HTML response must contain the article body and key media references. ChatGPT and Claude bots will not wait for hydration.

Q: Should I block AI crawlers from images entirely?

Only if you have a specific licensing or rights reason. Blocking via robots.txt removes you from AI citation surfaces while still leaving you scrapable by non-compliant bots; consider a noai/noimageai strategy plus C2PA opt-out instead.
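
Before changing anything, it is worth confirming what the current robots.txt already allows for image URLs. A small sketch with Python's standard urllib.robotparser; the crawler list is taken from this article, blocked_crawlers is a hypothetical helper, and the result reflects only what compliant crawlers should do.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to test; taken from the crawlers discussed in this article.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def blocked_crawlers(robots_txt: str, url: str) -> list[str]:
    """Return the crawlers that robots.txt disallows from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]
```

Passing the text of robots.txt and an image URL shows which of the listed crawlers are excluded; non-compliant bots ignore these rules regardless of the outcome.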
