XML Sitemap Priority and Changefreq for AI Search Crawlers
XML sitemaps are still the most important discovery surface for AI crawlers, but the legacy `<priority>` and `<changefreq>` fields are mostly ignored.
TL;DR
Keep your sitemap minimal: canonical `<loc>` entries with an accurate `<lastmod>`, and nothing else.
Field semantics
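The sitemaps.org schema defines four child elements per `<url>` entry (`<loc>`, `<lastmod>`, `<changefreq>`, `<priority>`); extension namespaces add image, video, and news elements: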
| Field | Required | AI-search relevance |
|---|---|---|
| `<loc>` | Yes | High (the URL itself) |
| `<lastmod>` | No | High (most useful sitemap signal when accurate) |
| `<changefreq>` | No | Low (mostly ignored; Slickplan, 2025) |
| `<priority>` | No | Low (mostly ignored) |
| Image / video / news extensions | No | Medium (useful for media-rich content) |
Gary Illyes confirmed Google "mostly ignores"
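`<priority>` and `<changefreq>`
Google openly states it ignores both. There is no public documentation that GPTBot, OAI-SearchBot, ClaudeBot, or PerplexityBot consult either. Bingbot honors them as weak hints. Setting `<priority>` to 1.0 on every URL carries no information, because the value is only meaningful relative to other URLs on the same site.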
Canonical-only rule
Include only canonical URLs you want indexed. Excluding noindex, redirected, blocked, or duplicate URLs is mandatory — their presence weakens the trust of the entire sitemap (Google, 2014).
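The canonical-only rule can be enforced at generation time rather than by hand. The sketch below is one possible filter, assuming you already have crawl or CMS data for each page; the `Page` record and its field names are hypothetical and not part of any sitemap tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical page record; field names are illustrative only."""
    url: str
    status: int = 200                 # HTTP status the URL returns
    canonical: Optional[str] = None   # rel=canonical target, if declared
    noindex: bool = False             # meta robots or X-Robots-Tag noindex
    blocked_by_robots: bool = False   # disallowed in robots.txt


def sitemap_eligible(pages: Iterable[Page]) -> List[str]:
    """Return URLs that are safe to list: 200, indexable, self-canonical."""
    seen = set()
    eligible = []
    for page in pages:
        if page.status != 200:
            continue  # redirects (3xx) and errors stay out
        if page.noindex or page.blocked_by_robots:
            continue  # not indexable, so not listed
        if page.canonical not in (None, page.url):
            continue  # duplicate pointing at another canonical URL
        if page.url in seen:
            continue  # already listed once
        seen.add(page.url)
        eligible.append(page.url)
    return eligible
```

Running the filter inside the sitemap build step means a redirected or noindexed URL drops out of the sitemap as soon as its state changes.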
AI-crawler observed behavior
| Crawler | Reads sitemap | Honors `<lastmod>` | Honors `<priority>` / `<changefreq>` |
|---|---|---|---|
| Googlebot / GoogleOther / Google-Extended | Yes | As hint when accurate | No |
| Bingbot | Yes | Yes | Weak hint |
| GPTBot / OAI-SearchBot | Yes (per OpenAI docs) | Likely (timestamp comparison) | No documented use |
| ClaudeBot | Yes | Likely | No documented use |
| PerplexityBot | Yes | Likely | No documented use |
| Bytespider | Inconsistent | — | — |
The practical conclusion: AI crawlers use sitemaps for URL discovery and rely on `<lastmod>`, not `<priority>` or `<changefreq>`, to judge freshness.
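To make the "timestamp comparison" entry in the table concrete, here is a minimal sketch of how a crawler could decide which URLs to refetch. It is an illustration only: no AI crawler documents its scheduling logic, and the function names and the `last_fetched` mapping are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Dict, List

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value: str) -> datetime:
    """Parse a full date or datetime <lastmod>; naive values are treated as UTC.

    Partial W3C forms such as "2026-05" are not handled in this sketch.
    """
    text = value.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"  # older Pythons reject a trailing "Z"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def urls_to_refetch(sitemap_xml: str, last_fetched: Dict[str, datetime]) -> List[str]:
    """Return URLs that are new, or whose <lastmod> is newer than the last fetch."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not loc:
            continue
        fetched_at = last_fetched.get(loc)
        if fetched_at is None:
            stale.append(loc)  # never seen before: pure discovery
        elif lastmod and parse_lastmod(lastmod) > fetched_at:
            stale.append(loc)  # changed since the last fetch
    return stale
```

Under this model an inflated `<lastmod>` triggers wasted refetches, which is one plausible reason crawlers stop trusting the field on sites that bump it without real changes.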
Sample sitemap (recommended shape)
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/article-a</loc>
<lastmod>2026-05-03T09:00:00+00:00</lastmod>
</url>
<url>
<loc>https://example.com/article-b</loc>
<lastmod>2026-04-30T12:00:00+00:00</lastmod>
</url>
</urlset>

For sites over 50,000 URLs, use a sitemap index pointing to per-section sitemaps (/sitemap-articles.xml, /sitemap-glossary.xml, etc.).
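A generator for this shape can be sketched with the Python standard library. The version below is an illustration, not a reference implementation: it splits by count into numbered files (`sitemap-1.xml`, ...) rather than by section, and the function names and file naming are assumptions. It also assumes every `lastmod` string uses the same UTC format, so that string order matches time order.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Sequence, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemaps.org protocol


def build_urlset(entries: Sequence[Tuple[str, str]]) -> bytes:
    """Serialize (loc, lastmod) pairs as a <urlset> with only those two fields."""
    # Setting xmlns as a plain attribute keeps child tags unprefixed on output.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_sitemaps(
    entries: Sequence[Tuple[str, str]],
    base_url: str,
    max_urls: int = MAX_URLS,
) -> Dict[str, bytes]:
    """Return {filename: xml bytes} for the child sitemaps plus one index file."""
    files: Dict[str, bytes] = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(entries), max_urls):
        chunk = entries[start:start + max_urls]
        name = f"sitemap-{start // max_urls + 1}.xml"
        files[name] = build_urlset(chunk)
        child = ET.SubElement(index, "sitemap")
        ET.SubElement(child, "loc").text = f"{base_url}/{name}"
        # Newest lastmod in the chunk; relies on the uniform-format assumption.
        ET.SubElement(child, "lastmod").text = max(lastmod for _, lastmod in chunk)
    files["sitemap-index.xml"] = ET.tostring(
        index, encoding="utf-8", xml_declaration=True
    )
    return files
```

Omitting `<priority>` and `<changefreq>` from the generator entirely removes any chance of them drifting out of date.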
Sitemap freshness: pings and IndexNow
Google deprecated the /ping?sitemap=... HTTP endpoint in 2023. Today the recommended workflow is:
- Google Search Console — submit each sitemap once; Google polls them periodically.
- Bing Webmaster Tools — submit and enable IndexNow.
- IndexNow — on every meaningful publish or update, ping https://api.indexnow.org/indexnow?url=...&key=.... Supported by Microsoft Bing, Naver, Seznam.cz, Yandex, and Yep (IndexNow.org, 2026; Bing, 2026).
Bing now recommends IndexNow over manual URL Submission as the primary path (Bing Webmaster Tools, 2026). Because ChatGPT Search is layered on Bing's index, IndexNow indirectly accelerates ChatGPT visibility — community write-ups consistently report faster appearance after IndexNow adoption.
IndexNow request shape
POST /indexnow HTTP/1.1
Host: api.indexnow.org
Content-Type: application/json

{
"host": "example.com",
"key": "abcdef1234567890abcdef1234567890",
"keyLocation": "https://example.com/abcdef1234567890abcdef1234567890.txt",
"urlList": [
"https://example.com/article-a",
"https://example.com/article-b"
]
}
Host the key file at the documented keyLocation. Most CMS and static-site tools (Next.js, Astro, Hugo plugins) ship IndexNow integrations.
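If your stack has no IndexNow integration, the request above is simple to construct by hand. The following is a minimal sketch using only the standard library; the helper name `build_indexnow_request` is hypothetical, and it assumes the key file lives at the site root as `<key>.txt`, matching the example payload.

```python
import json
import urllib.parse
import urllib.request
from typing import Sequence

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(
    host: str, key: str, urls: Sequence[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the IndexNow JSON body."""
    foreign = [
        u for u in urls
        if urllib.parse.urlsplit(u).hostname != host.lower()
    ]
    if foreign:
        # Every submitted URL should belong to the host named in the payload.
        raise ValueError(f"URLs outside {host}: {foreign}")
    payload = {
        "host": host,
        "key": key,
        # Assumption: key file hosted at the site root, as in the example above.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

To submit, pass the result to `urllib.request.urlopen(request, timeout=10)` from your publish hook and log the response status, so that failed pings are visible.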
Audit checklist
- [ ] Sitemap returns 200 with Content-Type: application/xml.
- [ ] All URLs are canonical, not redirected, not blocked.
- [ ] `<lastmod>` reflects the actual last meaningful change.
- [ ] `<priority>` and `<changefreq>` are absent or all set to a single benign value.
- [ ] Sitemap submitted in Google Search Console and Bing Webmaster Tools.
- [ ] IndexNow key file hosted; pings fire on publish.
- [ ] Sitemap index used if URL count exceeds 50,000.
- [ ] Each child sitemap is under 50 MB uncompressed.
- [ ] Sitemap: directive present in robots.txt.
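Several of these checks can be automated against the sitemap file itself. The sketch below covers only the offline items (size, URL count, `<lastmod>` validity, uniform `<priority>` / `<changefreq>`); HTTP status, redirects, and robots.txt need a separate crawl. The function name and message wording are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB uncompressed


def _valid_lastmod(text: str) -> bool:
    """Accept full ISO dates or datetimes; partial W3C forms are rejected here."""
    text = text.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True


def audit_sitemap(xml_bytes: bytes) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file exceeds 50 MB uncompressed")
    root = ET.fromstring(xml_bytes)
    urls = root.findall(f"{NS}url")
    if len(urls) > MAX_URLS:
        problems.append("more than 50,000 URLs; split behind a sitemap index")
    for field in ("priority", "changefreq"):
        values = {(el.text or "").strip() for el in root.iter(f"{NS}{field}")}
        if len(values) > 1:
            problems.append(f"<{field}> varies across URLs; drop it or use one value")
    for url in urls:
        loc = (url.findtext(f"{NS}loc") or "").strip()
        lastmod = url.findtext(f"{NS}lastmod")
        if not loc:
            problems.append("<url> entry without <loc>")
        elif lastmod is None:
            problems.append(f"{loc}: missing <lastmod>")
        elif not _valid_lastmod(lastmod):
            problems.append(f"{loc}: unparseable <lastmod> {lastmod!r}")
    return problems
```

Running this in CI against the generated sitemap catches regressions before a crawler sees them.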
Common mistakes
- Setting `<priority>` to 1.0 on every URL.
- Bumping `<lastmod>` daily on unchanged content (it teaches crawlers to ignore it).
- Including URLs that 301 to elsewhere or return noindex.
- Forgetting to ping IndexNow on edits — the protocol is push-based.
- Mixing canonical and parameter-laden URLs in one sitemap.
- Relying on the deprecated Google ping endpoint.
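Two of these mistakes, bumping `<lastmod>` without a real change and forgetting to ping on edits, can be avoided by tying both to a content hash. The sketch below shows one way to do that; the in-memory `State` mapping and the function name are hypothetical, and a real build would persist the state between runs.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# url -> (content hash, lastmod string); a hypothetical in-memory store
State = Dict[str, Tuple[str, str]]


def update_lastmod(
    state: State, url: str, body: str, now: Optional[datetime] = None
) -> Tuple[str, bool]:
    """Return (lastmod, changed); lastmod only moves when the content hash does."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    previous = state.get(url)
    if previous is not None and previous[0] == digest:
        return previous[1], False  # unchanged: keep the old timestamp
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    state[url] = (digest, stamp)
    return stamp, True  # new or changed: caller may also ping IndexNow
```

Hashing the main content rather than the full rendered HTML is usually the better choice, since template or footer changes would otherwise count as edits.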
FAQ
Q: Do AI search crawlers read XML sitemaps?
Yes. OpenAI, Anthropic, and Perplexity all use sitemap discovery for URL enumeration. They rely on `<lastmod>` for freshness; none documents any use of `<priority>` or `<changefreq>`.
Q: Should I remove `<priority>` and `<changefreq>` entirely?
You can. They have no documented positive effect for Google or for AI crawlers and they encourage drift. If you keep them, set them once based on your site structure and do not touch them again.
Q: How does IndexNow help ChatGPT visibility?
ChatGPT Search uses the Bing index. IndexNow accelerates Bing indexing of changed URLs, which in turn accelerates ChatGPT freshness. There is no direct IndexNow endpoint for OpenAI (IndexNow.org, 2026).
Q: What is the maximum sitemap size?
50,000 URLs and 50 MB uncompressed per the sitemaps.org spec. Use a sitemap index when you exceed either limit.
Q: Does the deprecated Google sitemap ping still work?
No. Google deprecated the /ping?sitemap=... endpoint in 2023. Submit sitemaps through Search Console and rely on lastmod for ongoing freshness.
Q: Should I submit RSS or Atom feeds in addition to a sitemap?
Yes for news and frequently updated content. Search engines and AI crawlers benefit from feeds for freshness signals while sitemaps anchor full URL coverage.