Lazy-Loading Impact on AI Crawlers: What Gets Indexed vs Skipped

Lazy-loading defers content until a user (or simulated viewport) reaches it. Most AI search crawlers do not execute JavaScript, do not scroll, and do not simulate a viewport, so JavaScript-driven lazy-loading effectively hides that content from GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot. Native HTML loading="lazy" is safe; client-side patterns require server-side rendering or HTML fallbacks.

TL;DR

Use native loading="lazy" for images and iframes — it is HTML-attribute based and visible to every AI crawler. Avoid JavaScript-only lazy-loading patterns (IntersectionObserver swapping data-src to src) on critical content unless you also server-side render the final HTML. Gemini and Google AI Overviews can render JS via Googlebot infrastructure; ChatGPT, Claude, and Perplexity crawlers cannot.

Per-crawler reference

Crawler	Renders JavaScript	Sees native loading="lazy"	Sees JS lazy-load (IntersectionObserver)	Notes
GPTBot (training)	No	Yes (HTML attribute)	No	Fetches raw HTML only (Vercel, 2025)
OAI-SearchBot	No	Yes	No	Same engine class as GPTBot
ChatGPT-User	No	Yes	No	User-initiated fetch, still no JS execution
ClaudeBot	No	Yes	No	Anthropic crawler, no JS rendering (ClickRank, 2026)
PerplexityBot	No	Yes	No	Perplexity confirmed no JS execution
Googlebot	Yes (deferred)	Yes	Yes (after rendering)	Two-pass index with full JS (Google, 2025)
GoogleOther	Yes	Yes	Yes	Shares Googlebot infra
Google-Extended	N/A	N/A	N/A	robots.txt control token, not a separate crawler; fetching is done by existing Google user agents
Gemini (via Google infra)	Yes	Yes	Yes	Inherits Googlebot rendering
Bingbot	Limited	Yes	Partial	Some JS execution but less reliable than Googlebot
Bytespider (ByteDance)	No	Yes	No	Often ignores robots.txt

Patterns and AI-crawler behavior

Native loading="lazy" (safe)

<img src="/hero.jpg" loading="lazy" alt="Annotated diagram of an AI search funnel" width="1200" height="600">

The src and alt attributes are present in the initial HTML response. Every crawler — JS-rendering or not — sees them (web.dev, 2025). Always include explicit width and height to avoid CLS.
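
One way to check this across a whole page is to parse the raw HTML and list every <img> tag that lacks one of these attributes. The sketch below uses Python's standard html.parser module; the names ImgAttributeAudit and audit_images are illustrative and do not come from any tool cited here.

```python
from html.parser import HTMLParser

# Attributes every <img> should carry in the initial HTML response.
REQUIRED = ("src", "alt", "width", "height")


class ImgAttributeAudit(HTMLParser):
    """Collects <img> tags that lack any of the REQUIRED attributes."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src or placeholder, [missing attribute names])

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        present = dict(attrs)
        # An empty alt="" counts as present; only absent attributes are flagged.
        missing = [name for name in REQUIRED if name not in present]
        if missing:
            self.problems.append((present.get("src", "<no src>"), missing))


def audit_images(html):
    """Return the list of problem images found in an HTML string."""
    parser = ImgAttributeAudit()
    parser.feed(html)
    return parser.problems
```

An empty list means every image in the response exposes its source, alt text, and dimensions without any script running.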

IntersectionObserver swap (unsafe without SSR)

<img data-src="/hero.jpg" alt="..." class="lazy">
<script>/* swaps data-src → src on viewport entry */</script>

GPTBot, ClaudeBot, and PerplexityBot see only the data-src attribute. They do not extract the image. They also miss any text rendered into the DOM by client-side JS frameworks (React, Vue, Svelte) without server-side rendering (Passionfruit, 2026).
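
To spot this pattern in a raw HTML response, a short Python sketch can list images that carry data-src but no src. It assumes the lazy library uses the data-src attribute name shown above; libraries that use a different attribute, or that set a placeholder src, would need an adjusted check. DeferredImageFinder and find_deferred_images are hypothetical names.

```python
from html.parser import HTMLParser


class DeferredImageFinder(HTMLParser):
    """Finds <img> tags whose real URL lives only in data-src."""

    def __init__(self):
        super().__init__()
        self.deferred = []  # data-src values with no usable src beside them

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("data-src") and not attributes.get("src"):
            self.deferred.append(attributes["data-src"])


def find_deferred_images(html):
    """Return data-src URLs that a non-rendering crawler would never resolve."""
    finder = DeferredImageFinder()
    finder.feed(html)
    return finder.deferred
```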

Background images via CSS

CSS-only background-image: url(...) is parsed by JS-rendering crawlers but ignored by those that only consume HTML. Use <img> tags with loading="lazy" for any image whose alt text is meaningful for retrieval.

Below-the-fold text in tabs and accordions

If the markup is in the initial HTML (collapsed via CSS), AI crawlers see it. If it is fetched via XHR on click, non-rendering crawlers never see it, and even JS-rendering crawlers such as Googlebot may miss it because they do not click or otherwise interact with the page.
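
A quick way to test a given tab or accordion is to check whether a sentence from it appears in the text of the raw HTML response. The following Python sketch does that with the standard library; TextExtractor and missing_phrases are illustrative names, and the matching is a plain case-insensitive substring test, so a phrase split mid-word by inline tags would be reported as missing.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_phrases(raw_html, phrases):
    """Return the phrases that do not occur in the page's markup text."""
    extractor = TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    # Collapse whitespace so line breaks in the markup do not break matches.
    text = " ".join(" ".join(extractor.chunks).split()).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]
```

Any phrase returned here is absent from the initial response, which suggests it is injected later by script or fetched on interaction.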

Audit methodology

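Use two curl commands to compare what your server returns to an AI-crawler user agent with what it returns to a browser user agent:

curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://example.com/article -o ai.html
curl -A "Mozilla/5.0 (Chrome/120)" https://example.com/article -o human.html
# <( ) needs bash or zsh; the leading space in the pattern keeps data-src from matching
diff <(grep -oE ' src="[^"]+"' ai.html | sort) \
     <(grep -oE ' src="[^"]+"' human.html | sort)

Neither request executes JavaScript, so a difference here means the server varies its response by user agent (bot blocking, dynamic rendering, or cloaking rules). To find images that are JS-loaded and invisible to non-rendering crawlers, run the same diff between ai.html and the rendered DOM saved from a browser (in DevTools, copy the outer HTML of the <html> element into a file). Repeat with ClaudeBot, PerplexityBot, and Applebot user-agent strings, and first check whether your CDN or firewall treats spoofed bot user agents differently, since that skews the comparison.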
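
The same comparison can be run on parsed <img> tags instead of regex matches, which avoids counting data-src as src. This is a minimal Python sketch, assuming you have the raw response and a rendered DOM snapshot (for example, saved from browser DevTools) as strings; ImgSrcCollector, image_sources, and images_missing_from_raw are hypothetical names.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Gathers the src value of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.add(src)


def image_sources(html):
    """Return the set of image URLs exposed through a real src attribute."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return collector.sources


def images_missing_from_raw(raw_html, rendered_html):
    """Return image URLs present after rendering but absent from the raw response."""
    return sorted(image_sources(rendered_html) - image_sources(raw_html))
```

URLs returned by images_missing_from_raw are the ones a crawler that reads only the initial HTML would not find.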

Audit checklist

[ ] Critical images use loading="lazy" HTML attribute, not data-src swaps.
[ ] All <img> tags carry alt, width, and height.
[ ] Article body text is in the initial HTML response (server-side rendered if framework-based).
[ ] Structured data (JSON-LD) is in initial HTML, not injected by client-side JS.
[ ] Tabbed and accordion content is present in DOM at load, hidden via CSS.
[ ] Diff between AI-bot user-agent fetch and human fetch shows minimal content gap.
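
The structured-data item in this checklist can be verified mechanically. The Python sketch below, offered as one possible approach, extracts JSON-LD blocks from the raw HTML and reports their @type values; JsonLdCollector and jsonld_types are illustrative names, and blocks that fail to parse as JSON are skipped silently.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Captures the text of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def jsonld_types(raw_html):
    """Return the @type of each top-level JSON-LD object in the raw HTML."""
    collector = JsonLdCollector()
    collector.feed(raw_html)
    types = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed block: ignored in this sketch
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                types.append(item["@type"])
    return types
```

If the list is empty for a page whose browser-rendered DOM does contain JSON-LD, the structured data is being injected client-side.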

Common mistakes

Replacing <img src> with <img data-src> and a JS lazy library on hero or in-body images.
Hydrating React/Vue/Svelte client-side without an SSR or static export step.
Loading FAQ JSON-LD via fetch() after page load.
Using display: none on critical sections — some crawlers discount hidden content.
Trusting that "Googlebot sees it, so AI bots will too" — they will not.

FAQ

Q: Do GPTBot and ClaudeBot run JavaScript?

No. Independent log-file analysis confirms ChatGPT and Claude crawlers fetch JavaScript files but do not execute them, so any content rendered client-side is invisible (Vercel, 2025).

Q: Is native HTML loading="lazy" safe for AI crawlers?

Yes. The attribute is parsed at HTML level and the src and alt are present in the initial response. All crawlers see it.

Q: What about Gemini and Google AI Overviews?

They rely on Googlebot's rendering pipeline, which executes JavaScript in a deferred second pass. JS lazy-loading is mostly handled, but only if the DOM eventually settles and content enters the viewport simulation Googlebot performs.

Q: How do I make a React or Next.js site visible to AI crawlers?

Use server-side rendering (getServerSideProps, RSC streaming with proper HTML emission) or static export. The initial HTML response must contain the article body and key media references. ChatGPT and Claude bots will not wait for hydration.

Q: Should I block AI crawlers from images entirely?

Only if you have a specific licensing or rights reason. Blocking via robots.txt removes you from AI citation surfaces while still leaving you scrapable by non-compliant bots; consider a noai/noimageai strategy plus C2PA opt-out instead.
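
Before changing robots.txt, it can help to see which AI crawlers your current rules already block for a given URL. The following is a small Python sketch using the standard urllib.robotparser module; the agent list and the blocked_agents helper are assumptions for illustration, and the result reflects only what the rules say, not whether a given bot obeys them.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the crawler table in this article.
AI_AGENTS = ("GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")


def blocked_agents(robots_txt, url, agents=AI_AGENTS):
    """Return the agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    rules = "User-agent: GPTBot\nDisallow: /images/\n\nUser-agent: *\nAllow: /\n"
    print(blocked_agents(rules, "https://example.com/images/hero.jpg"))
```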