AI Search Bot Changelog Reference

This reference tracks dated user-agent, robots.txt, and IP-range changes for the major AI search crawlers and crawler tokens (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, Google-Extended, and Bingbot) so that infrastructure and SEO teams can keep robots.txt, WAF, and CDN rules aligned with current bot identities.

TL;DR

AI crawler identities change regularly: vendors split bots, bump version numbers, refresh IP allowlists, and quietly revise robots.txt language — usually without dedicated release notes.
OpenAI now operates three crawlers (GPTBot, OAI-SearchBot, ChatGPT-User) with overlapping but distinct robots.txt rules; ChatGPT-User's compliance language was narrowed in late 2025.
Anthropic clarified its three-bot model (ClaudeBot, Claude-User, Claude-SearchBot) in early 2026 and confirmed all three honor robots.txt, including the non-standard Crawl-delay directive.
Perplexity ships PerplexityBot and Perplexity-User, but Cloudflare has documented stealth crawling and rotating ASNs for some Perplexity traffic, so robots.txt alone is not sufficient.
Google-Extended (introduced 28 September 2023) is a robots.txt control token only; it shares user-agent strings with Googlebot and does not block AI Overviews.

Definition

The AI search bot changelog is a dated, vendor-by-vendor log of changes to the user-agent strings, robots.txt behavior, IP allowlists, and crawler roles of the bots that AI search products (ChatGPT, Claude, Perplexity, Gemini, Bing/Copilot) use to fetch web content. Unlike entries in a traditional product changelog, AI crawler changes are scattered across vendor documentation pages, support articles, and announcement posts, and many are made silently: a string is updated, a new bot appears, or a robots.txt clause is rewritten without a public release note.

This reference consolidates those changes into per-bot tables so technical SEO and infrastructure teams can keep robots.txt, WAF rules, and CDN bot-management policies aligned with current bot identities. It is intentionally conservative: every entry is anchored to a primary vendor doc or a verifiable secondary source, and operational impact is described in plain language so the table can be read by both engineering and SEO stakeholders.

Why this matters

AI crawlers now account for a noticeable share of bot traffic on many public sites, and their behavior is less predictable than that of traditional search-engine crawlers. A robots.txt rule written when GPTBot launched in August 2023 will not, on its own, cover OAI-SearchBot, ChatGPT-User, or any of Anthropic's three Claude bots: each was either documented later or has since had its robots.txt compliance language revised. Teams that do not track these changes typically run into three operational problems:

Stale block lists. Rules target bots that have been renamed, split, or merged, so new traffic is not actually controlled.
Accidental visibility loss. Reusing a block rule written for an AI training bot (e.g., GPTBot) on a search bot (e.g., OAI-SearchBot) can quietly remove the site from ChatGPT Search citations.
WAF false positives and negatives. Vendor-published IP ranges for bots such as Bingbot and GPTBot change over time, so WAF rules built from an old snapshot either block legitimate crawler traffic or keep trusting ranges the vendor has dropped.

A maintained changelog converts these silent changes into a reviewable artifact that SEO, security, and platform teams can audit at a regular cadence.
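As a rough illustration of the stale-block-list problem, the sketch below reports which of the tokens covered in this reference are named explicitly in a robots.txt file and which fall through to the wildcard group. The token list is copied from this reference; the function name audit_tokens and the sample robots.txt are hypothetical and only meant to show the shape of such an audit.

```python
# Tokens tracked in this reference; extend as vendors add or rename bots.
KNOWN_TOKENS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot", "Perplexity-User",
    "Google-Extended", "Bingbot",
]


def audit_tokens(robots_txt: str) -> dict:
    """Map each known token to True if robots.txt names it in a User-agent line."""
    named = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent":
            named.add(value.strip().lower())
    return {token: token.lower() in named for token in KNOWN_TOKENS}


# Hypothetical robots.txt written when only GPTBot existed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for token, explicit in audit_tokens(SAMPLE).items():
        status = "explicit rule" if explicit else "falls through to *"
        print(f"{token:18} {status}")
```

Running a check like this during each review makes it visible when a newly documented bot is still governed only by the wildcard group.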

How it works

Each entry below uses the same shape: date, change, source, operational impact. Dates reflect when the change was published or first verifiable in vendor documentation, not when individual sites observed it. "Source" links the primary vendor doc whenever available; secondary sources (Search Engine Land, Search Engine Roundtable, Cloudflare, Bing Webmaster Tools) are used when no primary doc exists.
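For teams that want to keep the tables machine-readable, the entry shape described above could be modeled roughly as follows. This is a minimal sketch: the class name ChangelogEntry and the helper parse_row are hypothetical, the rows are assumed to be tab-separated, and the sample row is abridged from the GPTBot table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangelogEntry:
    # Kept as text: values such as "2025-Q4" or "2024 (mid)" are not ISO dates.
    date: str
    change: str
    source: str
    impact: str


def parse_row(row: str) -> ChangelogEntry:
    """Split one tab-separated table row into its four fields."""
    fields = [field.strip() for field in row.split("\t")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(fields)}")
    return ChangelogEntry(*fields)


# Abridged sample row; the wording is shortened for the example.
row = "2023-08\tGPTBot launched.\tOpenAI launch announcement\tFirst AI training opt-out token."
entry = parse_row(row)
print(entry.date, "|", entry.source)
```

Keeping the date as free text avoids inventing a precision the source does not have.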

Some controls are robots.txt tokens rather than separate crawlers. Google-Extended is the canonical example: crawling continues under the existing Googlebot user-agent strings, and any change affects only what the robots.txt token means. Those entries note the distinction explicitly so engineers do not waste time looking for a separate UA in their access logs.

The reference is grouped by vendor and then by bot, so you can scan only the rows that affect a single robots.txt block. Where a vendor publishes a verified IP allowlist, the URL of the JSON file appears in the row's Change column; treat those JSON files as the source of truth for WAF rules and refresh them on a schedule rather than hardcoding ranges.
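One way to make a scheduled refresh reviewable is to diff the cached allowlist against a freshly downloaded copy. The sketch below assumes the JSON files use a {"prefixes": [{"ipv4Prefix": ...}, {"ipv6Prefix": ...}]} layout, which should be confirmed against the real files; the function names are hypothetical, fetching is left out, and the sample ranges are documentation placeholders rather than real vendor IPs.

```python
import ipaddress
import json


def load_prefixes(doc: str) -> set:
    """Parse a prefix-list JSON document into a set of ip_network objects.

    Assumes a {"prefixes": [{"ipv4Prefix": ...} or {"ipv6Prefix": ...}]} layout.
    """
    networks = set()
    for item in json.loads(doc).get("prefixes", []):
        cidr = item.get("ipv4Prefix") or item.get("ipv6Prefix")
        if cidr:
            networks.add(ipaddress.ip_network(cidr))
    return networks


def diff_prefixes(old: set, new: set) -> tuple:
    """Return (added, removed) networks between two snapshots."""
    return new - old, old - new


# Placeholder snapshots using RFC 5737 documentation ranges, not real vendor IPs.
cached = '{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv4Prefix": "198.51.100.0/24"}]}'
fresh = '{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv4Prefix": "203.0.113.0/24"}]}'

added, removed = diff_prefixes(load_prefixes(cached), load_prefixes(fresh))
print("add to WAF allowlist:", sorted(map(str, added)))
print("remove from WAF allowlist:", sorted(map(str, removed)))
```

Reviewing the added and removed sets before applying them gives the WAF change the same audit trail as any other rule change.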

Practical application

OpenAI: GPTBot

Date	Change	Source	Impact
2023-08	GPTBot launched. Initial UA contains GPTBot/1.0 and +https://openai.com/gptbot. Honors robots.txt.	OpenAI launch announcement	First AI training opt-out token site owners can deploy.
2024 (mid)	UA bumped to GPTBot/1.2 (observed in production logs).	OpenAI bots overview	Pattern-match rules pinned to /1.0 exactly need to widen to GPTBot/.
2025 (late)	Current UA in vendor docs is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3; +https://openai.com/gptbot. Verified IP list at openai.com/gptbot.json.	OpenAI bots overview	Update WAF allowlists from the JSON file; do not hardcode IPs.
2025-Q4	OpenAI clarifies that GPTBot and OAI-SearchBot may share crawl results, so a site allowing both may be visited by either to satisfy both use cases.	Search Engine Roundtable, 2025	Blocking only one of GPTBot/OAI-SearchBot may not fully isolate training-adjacent crawling.
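Because the GPTBot version number has changed more than once, log filters and WAF patterns are safer when they match the token and treat the version as variable. A minimal sketch, assuming the UA format quoted in the table; the helper name gptbot_version is hypothetical.

```python
import re

# Match the GPTBot token with any version number instead of pinning to /1.0.
GPTBOT_RE = re.compile(r"\bGPTBot/(\d+(?:\.\d+)*)")


def gptbot_version(user_agent: str):
    """Return the GPTBot version string from a UA header, or None if absent."""
    match = GPTBOT_RE.search(user_agent)
    return match.group(1) if match else None


# UA string as quoted in the table above.
ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
      "compatible; GPTBot/1.3; +https://openai.com/gptbot")
print(gptbot_version(ua))  # prints 1.3
```

A UA match only identifies what a request claims to be; it says nothing about whether the request really comes from OpenAI.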

OpenAI: OAI-SearchBot

Date	Change	Source	Impact
2024-Q3	OAI-SearchBot introduced as the bot powering ChatGPT Search. Honors robots.txt. Distinct token from GPTBot.	OpenAI bots overview	Sites wanting ChatGPT Search citations must allow OAI-SearchBot specifically.
2025-Q4	Documentation clarifies OAI-SearchBot is no longer used to feed navigational links inside ChatGPT answers — blocking it does not strip those citation links.	Search Engine Roundtable, 2025	Citation strategies must account for the split between crawl source and answer-rendering surface.
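Since GPTBot and OAI-SearchBot are distinct tokens, a policy can treat them differently. The sketch below evaluates a hypothetical robots.txt with Python's standard-library parser to show the two tokens resolving to different answers; the policy, the helper name allowed, and the example.com URL are illustrative, and a vendor's own parser may match tokens differently from the standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training crawls, stay eligible for ChatGPT Search.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""


def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Evaluate a robots.txt body for one user-agent token using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


url = "https://example.com/docs/"
for token in ("GPTBot", "OAI-SearchBot"):
    print(token, allowed(ROBOTS_TXT, token, url))
```

A check like this is a quick regression test to run whenever the robots.txt file is edited.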

OpenAI: ChatGPT-User

Date	Change	Source	Impact
2023-09	ChatGPT-User introduced for plugin-initiated browsing on behalf of a user. UA ChatGPT-User/1.0.	OpenAI bots overview	Treated as user-triggered fetches rather than bulk crawl.
2025-Q4	OpenAI narrows compliance language: robots.txt directives are described as applying to "OAI-SearchBot and GPTBot" rather than all three user agents. ChatGPT-User additionally cited as used for Custom GPTs and GPT Actions.	Search Engine Roundtable, 2025	Sites that relied on User-agent: ChatGPT-User to block live user fetches should re-test; those rules may now be advisory rather than authoritative.

Anthropic: ClaudeBot

Date	Change	Source	Impact
2024-Q1	ClaudeBot identified as Anthropic's training crawler. Honors robots.txt.	Anthropic Help Center	First clear opt-out token for Anthropic training data.
2024-2025	ClaudeBot traffic volume grew sharply on many sites; community reports describe sustained high request rates.	Practitioner reports (web developer community posts, 2025)	Many teams added User-agent: ClaudeBot blocks or rate-limited at the CDN layer.
2026-Q1	Anthropic refreshes crawler docs to formalize a three-bot model and confirm Crawl-delay is honored alongside standard robots.txt directives.	Search Engine Land, 2026	Crawl-delay becomes a valid throttling tool for ClaudeBot — useful where outright blocking is undesirable.
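A Crawl-delay rule for ClaudeBot might look like the hypothetical policy below. The sketch only confirms how Python's standard-library parser reads the directive; the 10-second value is an arbitrary example, and how a given crawler interprets the number is up to the vendor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: let ClaudeBot crawl, but ask for 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Crawl-delay: 10
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

claude_delay = parser.crawl_delay("ClaudeBot")  # delay declared for ClaudeBot
other_delay = parser.crawl_delay("Googlebot")   # no matching group, so None
print(claude_delay, other_delay)
```

Scoping the directive to a named group keeps it from being mistaken for a site-wide throttle.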

Anthropic: Claude-User

Date	Change	Source	Impact
2026-Q1	Claude-User documented as the bot that fetches pages when a Claude user explicitly requests them. Honors robots.txt.	Search Engine Land, 2026	Blocking Claude-User reduces visibility in user-directed Claude answers.

Anthropic: Claude-SearchBot

Date	Change	Source	Impact
2026-Q1	Claude-SearchBot documented as the bot that improves Claude search-result quality. Honors robots.txt.	Anthropic Help Center	Blocking Claude-SearchBot reduces Claude search citation visibility but not training exposure.

Perplexity: PerplexityBot

Date	Change	Source	Impact
2023	PerplexityBot identified as Perplexity AI's index crawler. UA Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot).	Perplexity crawler docs	Standard robots.txt-honoring index bot for Perplexity citations.
2024-06	Independent reports document Perplexity ignoring robots.txt disallow on at least some hosts.	Robb Knight, 2024	Surfaces that robots.txt cannot be relied on alone for Perplexity.
2025	Cloudflare publishes evidence of Perplexity using stealth crawlers, rotating ASNs, and obscured user agents to evade no-crawl directives.	Cloudflare, 2025	Sites that need to block Perplexity should layer WAF/IP and behavioral rules on top of robots.txt.
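One simple behavioral signal that can sit on top of robots.txt is to count requests for paths the file disallows for every agent, ideally paths that are not linked for human visitors. The sketch below is an assumption-laden illustration: the function name disallowed_hits, the (ip, path) log shape, and the threshold of two hits are all hypothetical, and the addresses are documentation placeholders.

```python
from collections import Counter
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def disallowed_hits(robots_txt: str, requests: list) -> Counter:
    """Count, per client IP, requests for paths that robots.txt disallows for everyone.

    `requests` is a list of (ip, path) tuples, e.g. extracted from access logs.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = Counter()
    for ip, path in requests:
        if not parser.can_fetch("*", path):
            hits[ip] += 1
    return hits


# Placeholder log rows with RFC 5737 documentation addresses.
log = [
    ("192.0.2.10", "/private/report.html"),
    ("192.0.2.10", "/private/data.json"),
    ("198.51.100.7", "/docs/index.html"),
]
suspects = {ip for ip, n in disallowed_hits(ROBOTS_TXT, log).items() if n >= 2}
print(suspects)
```

Because the check ignores the user-agent header, it still works when a crawler hides its identity, but it should feed a review queue or rate limit rather than an automatic block.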

Perplexity: Perplexity-User

Date	Change	Source	Impact
2025	Perplexity documents Perplexity-User as a distinct agent for live user-triggered queries, separate from PerplexityBot.	Perplexity crawler docs	Sites wanting to remain visible in Perplexity answers should keep Perplexity-User allowed even if PerplexityBot is restricted.

Google: Google-Extended

Date	Change	Source	Impact
2023-09-28	Google-Extended introduced as a robots.txt token for opting out of Bard/Gemini training and Vertex AI grounding. It has no standalone HTTP user-agent; crawling continues with existing Googlebot UAs.	Search Engine Land, 2023	Site owners get an AI-training opt-out without losing Google Search visibility.
2024-2025	Google clarifies that Google-Extended does not block AI Overviews; AI Overviews use the same content that powers Google Search.	Marie Haynes, 2025	Sites that disallow Google-Extended will still appear in AI Overviews if they remain in Google's regular index.
2025-Q4	Google publishes Google-Extended details inside the common-crawlers reference, formalizing scope across Gemini Apps and Vertex AI grounding.	Google for Developers	Single canonical source for current Google-Extended scope.

Microsoft: Bingbot

Date	Change	Source	Impact
Pre-2023	Bingbot operates as Bing's primary search crawler, UA family Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm). Verified IP list at bing.com/toolbox/bingbot.json.	Bing Webmaster Tools	Standard search crawler that also feeds Bing's AI products.
2023-2024	Bing's AI features (Bing Chat, then Copilot) draw on the existing Bing index — there is no separate "BingChat-Bot" UA.	Bing Webmaster Tools	Allowing Bingbot is the prerequisite for visibility in Microsoft's AI search surfaces.
2025	Bing reaffirms that robots.txt Crawl-delay overrides Webmaster Tools settings when both are present.	Bing Blogs	Use one mechanism, not both, to avoid conflicting throttling.

Common mistakes

Blocking GPTBot and assuming OpenAI is fully opted out. OAI-SearchBot and ChatGPT-User are separate tokens with separate robots.txt rules, and user-triggered ChatGPT-User fetches may not be bound by robots.txt under OpenAI's late-2025 wording.
Treating Google-Extended like a normal crawler. It has no separate UA, so user-agent-based WAF rules cannot enforce it — the directive lives only in robots.txt.
Hardcoding IP allowlists. Vendors update openai.com/gptbot.json, bing.com/toolbox/bingbot.json, and equivalents; static lists rot within months.
Relying on robots.txt alone for Perplexity. Cloudflare-documented stealth-crawler behavior means network-layer enforcement is sometimes required.
Reusing one Crawl-delay rule across all bots. Only some crawlers honor Crawl-delay (Anthropic's bots per its 2026 docs, and Bingbot); Googlebot ignores it entirely.

FAQ

Q: How often should I update my AI crawler rules?

At minimum, review robots.txt and WAF rules quarterly, and re-pull verified IP allowlists from each vendor's JSON file at the same cadence. New bots and renamed tokens have shipped roughly every 6-9 months across major vendors since 2023.

Q: Does blocking GPTBot remove me from ChatGPT Search?

Not by itself. ChatGPT Search uses OAI-SearchBot, which is a separate robots.txt token. However, OpenAI's late-2025 documentation notes that GPTBot and OAI-SearchBot may share crawl results, so blocking only one may not fully isolate the use cases.

Q: Why do some AI bots ignore robots.txt?

Some user-triggered agents (for example ChatGPT-User and Perplexity-User) are framed as fetching on behalf of a person, similar to a browser, and their vendors argue that robots.txt does not strictly apply; Anthropic, by contrast, documents Claude-User as honoring robots.txt. Other cases, most prominently the Cloudflare-documented Perplexity stealth crawlers, appear to circumvent directives entirely and require network-layer blocking.

Q: Is there a single user-agent string that catches all AI bots?

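No. There is no AI-specific substring shared across GPTBot, ClaudeBot, PerplexityBot, and Bingbot (a case-insensitive match on "bot" also catches every ordinary crawler and misses agents such as ChatGPT-User), and Google-Extended never appears in a user-agent header at all. Reliable detection requires either (a) maintaining a list of explicit tokens or (b) verifying vendor IP ranges from the published JSON files.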
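Option (a) can be as small as a lookup table. The sketch below maps the UA tokens from this reference to a vendor and role; the role labels paraphrase the tables above, the names AI_BOT_TOKENS and classify are hypothetical, and the list needs the same maintenance as the changelog itself.

```python
# Explicit UA tokens from this reference, mapped to (vendor, role).
# Google-Extended is omitted on purpose: it is a robots.txt token with no UA of its own.
AI_BOT_TOKENS = {
    "GPTBot": ("OpenAI", "training"),
    "OAI-SearchBot": ("OpenAI", "search"),
    "ChatGPT-User": ("OpenAI", "user-triggered"),
    "ClaudeBot": ("Anthropic", "training"),
    "Claude-User": ("Anthropic", "user-triggered"),
    "Claude-SearchBot": ("Anthropic", "search"),
    "PerplexityBot": ("Perplexity", "search index"),
    "Perplexity-User": ("Perplexity", "user-triggered"),
    "bingbot": ("Microsoft", "search"),
}


def classify(user_agent: str):
    """Return (token, vendor, role) for the first known token found, else None."""
    lowered = user_agent.lower()
    for token, (vendor, role) in AI_BOT_TOKENS.items():
        if token.lower() in lowered:
            return token, vendor, role
    return None


print(classify("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))
print(classify("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/128.0"))
```

The result describes only what the header claims, so it is best paired with IP verification before it drives an allow decision.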

Q: Does Google-Extended block AI Overviews?

No. Google-Extended controls training and grounding for Gemini Apps and Vertex AI, but AI Overviews continue to use content from the regular Google Search index. Removing your site from AI Overviews requires removing it from Google Search itself, which is rarely the desired outcome.

Q: How do I tell a real GPTBot or Bingbot from a spoofed one?

Verify the source IP against the vendor-published JSON allowlist (openai.com/gptbot.json, bing.com/toolbox/bingbot.json). User-agent strings can be forged trivially; the source IP of an established connection is far harder to fake.
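The check itself is a CIDR membership test. The sketch below assumes the allowlist has already been downloaded and reduced to a list of CIDR strings; the function names are hypothetical and the ranges are documentation placeholders, not real GPTBot addresses.

```python
import ipaddress


def is_verified(ip: str, prefixes: list) -> bool:
    """Return True if `ip` falls inside any CIDR range in `prefixes`."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in prefixes)


def looks_spoofed(user_agent: str, ip: str, token: str, prefixes: list) -> bool:
    """Flag a request that claims `token` in its UA but comes from an unlisted IP."""
    return token.lower() in user_agent.lower() and not is_verified(ip, prefixes)


# Placeholder ranges (RFC 5737 / RFC 3849 documentation blocks), not real vendor IPs.
GPTBOT_PREFIXES = ["192.0.2.0/24", "2001:db8::/32"]

print(looks_spoofed("compatible; GPTBot/1.3", "192.0.2.44", "GPTBot", GPTBOT_PREFIXES))
print(looks_spoofed("compatible; GPTBot/1.3", "203.0.113.9", "GPTBot", GPTBOT_PREFIXES))
```

When the site sits behind a CDN or load balancer, the address to test is the client IP the proxy reports, not the proxy's own address.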

Q: What's the safest default robots.txt posture for a documentation site that wants AI citations?

Allow GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-User, Claude-SearchBot, PerplexityBot, Perplexity-User, and Bingbot. Leave Google-Extended allowed if you want Gemini grounding; disallow it only if you want to opt out of Gemini training and grounding. Layer rate limits at the CDN, not in robots.txt.
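That posture can be rendered as one explicit group per token, which keeps each bot's rule easy to audit. A minimal sketch, with a hypothetical build_robots helper and an optional Google-Extended opt-out; the output is checked with Python's standard-library parser, which may not match every vendor's parser exactly.

```python
from urllib.robotparser import RobotFileParser

ALLOW = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
    "PerplexityBot", "Perplexity-User", "Bingbot",
]


def build_robots(allow: list, opt_out_google_extended: bool = False) -> str:
    """Render one explicit group per token so each bot's rule is auditable."""
    groups = [f"User-agent: {token}\nAllow: /\n" for token in allow]
    rule = "Disallow: /" if opt_out_google_extended else "Allow: /"
    groups.append(f"User-agent: Google-Extended\n{rule}\n")
    return "\n".join(groups)


robots_txt = build_robots(ALLOW, opt_out_google_extended=True)

# Sanity-check the rendered file with the stdlib parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("OAI-SearchBot", "/docs/"))
print(parser.can_fetch("Google-Extended", "/docs/"))
```

Generating the file from a list means a newly documented bot is a one-line change that shows up clearly in code review.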