Video Sitemap Specification for AI Search Citations
A video sitemap is an XML file using Google's video:video namespace to declare video URLs and metadata (title, thumbnail, duration, content or player location). It is the discovery layer multimodal AI search engines use before pairing the video with transcripts and VideoObject JSON-LD for citation.
TL;DR
List every important video in an XML sitemap using the video:video extension. Required tags: video:thumbnail_loc, video:title, video:description, and one of video:content_loc or video:player_loc. Add duration, publication_date, tag, family_friendly, and live for richer signals. Pair each video page with on-page transcripts and VideoObject JSON-LD for AI citation eligibility.
Definition
A video sitemap is a sitemap using Google's video:video namespace to expose video metadata that base sitemaps cannot describe. Tags live in the namespace http://www.google.com/schemas/sitemap-video/1.1 (Google Search Central, 2024). Each
Why it matters for AI search
AI Mode and AI Overviews increasingly synthesize video citations alongside text results, and Google AI Mode supports image-based queries that may be answered with video clips (Google, 2025). For AI engines to cite a video, they must first discover it (sitemap), understand it (thumbnail + title + description, plus transcript), and trust it (publication date, license, page authority).
Three concrete impacts:
- Discovery for embedded or hosted video. Engines often cannot extract a usable video reference from JavaScript embeds without a sitemap entry.
- Snippet candidates. description and tag provide citable text; thumbnail_loc provides the visual snippet.
- Constraint-aware results. restriction, platform, requires_subscription, and live let engines avoid showing your video where it cannot play.
Required and recommended fields
| Tag | Required | Purpose |
|---|---|---|
| xmlns:video | Yes | Namespace declaration. |
| loc | Yes | Page URL hosting the video. |
| video:video | Yes | Container; one per video, multiple per page allowed. |
| video:thumbnail_loc | Yes | Image URL for the thumbnail. |
| video:title | Yes | Plain text or CDATA; escape HTML entities. |
| video:description | Yes | Up to 2,048 chars. |
| video:content_loc | One of these required | Direct video file URL (mp4, mov, etc.). |
| video:player_loc | One of these required | Embed/player URL (e.g., YouTube embed). |
| video:duration | Recommended | Seconds (1-28,800). |
| video:publication_date | Recommended | ISO-8601 datetime. |
| video:expiration_date | Optional | Stop showing in results after this date. |
| video:family_friendly | Optional | yes or no. |
| video:tag | Optional | Up to 32 tags per video. |
| video:category | Optional | Up to 256 chars. |
| video:restriction | Optional | ISO 3166 country codes. |
| video:platform | Optional | web, mobile, tv. |
| video:requires_subscription | Optional | yes or no. |
| video:live | Optional | yes or no. |
| video:rating | Optional | 0.0-5.0. |
| video:view_count | Optional | Integer view count. |
File constraints (Google guidance): a video sitemap can contain up to 50,000 entries and must be ≤ 50 MB uncompressed. Use a sitemap index for larger sets. Source files must be accessible to Googlebot — not blocked by robots.txt, login, or streaming-only protocols (Google Search Central, 2024).
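The 50,000-entry limit and the sitemap index can be sketched as follows. This is a minimal illustration, not a production generator: the page URLs, the file naming scheme, and the helper names `chunk` and `build_sitemap_index` are hypothetical, and the 50 MB size limit is not checked here.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000  # per-file entry limit quoted above


def chunk(items, size=MAX_ENTRIES):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each child sitemap."""
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        node = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(node, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical page URLs and file names, for illustration only.
pages = [f"https://example.com/videos/{n}" for n in range(120_000)]
parts = chunk(pages)
index_xml = build_sitemap_index(
    [f"https://example.com/sitemaps/video-{i + 1}.xml" for i in range(len(parts))]
)
```

Each slice in `parts` would be written to its own video sitemap file, and only the index URL needs to be submitted.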
How AI engines use the video sitemap
```mermaid
flowchart LR
    A["Crawler reads sitemap.xml"] --> B["Parse video:video entries"]
    B --> C["Fetch thumbnail + page"]
    C --> D["Read on-page transcript<br/>+ VideoObject JSON-LD"]
    D --> E["Index visual + textual<br/>signals together"]
    E --> F["AI answer cites video<br/>with thumbnail + snippet"]
```

The sitemap is discovery only. Citation quality depends on the on-page artifacts: a high-quality thumbnail, a descriptive title, and, critically, a textual transcript that AI engines can quote. Pair the sitemap with VideoObject JSON-LD that includes a transcript URL for the strongest stack.
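As a rough sketch of the pairing described above, the snippet below builds a VideoObject JSON-LD payload from the same fields a sitemap entry carries. The keys follow schema.org VideoObject property names; the function name, its arguments, and the transcript URL are hypothetical, and the `transcript` value may be text or a URL depending on how the page publishes it.

```python
import json


def video_object_jsonld(title, description, thumbnail_url, content_url,
                        duration_seconds, upload_date, transcript):
    """Build a schema.org VideoObject dict that mirrors the sitemap entry."""
    minutes, seconds = divmod(duration_seconds, 60)
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": title,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "contentUrl": content_url,
        "duration": f"PT{minutes}M{seconds}S",  # ISO 8601 duration
        "uploadDate": upload_date,
        "transcript": transcript,
    }


data = video_object_jsonld(
    title="How Google AI Mode answers multimodal queries",
    description="A 6-minute walkthrough of AI Mode.",
    thumbnail_url="https://example.com/img/ai-mode-thumb.jpg",
    content_url="https://example.com/video/ai-mode-walkthrough.mp4",
    duration_seconds=372,
    upload_date="2026-04-21T10:00:00+00:00",
    transcript="https://example.com/learn/ai-mode-walkthrough#transcript",  # hypothetical
)
# Embed this string in a <script type="application/ld+json"> element on the page.
jsonld = json.dumps(data, indent=2)
```

Generating the JSON-LD and the sitemap entry from one record keeps duration and thumbnail values from drifting apart.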
Canonical XML example
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://example.com/learn/ai-mode-walkthrough</loc>
    <video:video>
      <video:thumbnail_loc>https://example.com/img/ai-mode-thumb.jpg</video:thumbnail_loc>
      <video:title>How Google AI Mode answers multimodal queries</video:title>
      <video:description>A 6-minute walkthrough showing how AI Mode handles image and text queries, with examples and citation behavior.</video:description>
      <video:content_loc>https://example.com/video/ai-mode-walkthrough.mp4</video:content_loc>
      <video:player_loc>https://example.com/embed/ai-mode-walkthrough</video:player_loc>
      <video:duration>372</video:duration>
      <video:publication_date>2026-04-21T10:00:00+00:00</video:publication_date>
      <video:family_friendly>yes</video:family_friendly>
      <video:tag>AI Mode</video:tag>
      <video:tag>multimodal search</video:tag>
      <video:requires_subscription>no</video:requires_subscription>
      <video:live>no</video:live>
    </video:video>
  </url>
</urlset>
```

Implementation patterns (5 examples)
1. Self-hosted video on a docs site
Use content_loc with a direct mp4 URL, plus a transcript on the page and VideoObject.transcript JSON-LD. Set family_friendly: yes and requires_subscription: no to maximize eligibility.
2. YouTube embed
Use player_loc pointing to the embed URL. Provide a copy of the transcript on your page (YouTube auto-captions are not sufficient for AI citation grounding because engines tie quotes to the page URL).
3. Vimeo / Wistia private video
If the video requires authentication, set requires_subscription: yes. Engines may still index metadata for context but will not surface the player.
4. Live streams
Set live: yes and update expiration_date after the stream ends. Replace with the on-demand entry post-event.
5. Geographically restricted content
Use
Common errors and validator quirks
- Missing content_loc and player_loc — at least one is required.
- Thumbnail too small — follow Google's thumbnail size guidance; small or low-quality images suppress eligibility.
- Duration out of range — must be 1-28,800 seconds.
- Title with raw HTML — wrap in CDATA or escape entities.
- Sitemap blocked by robots.txt — verify the sitemap URL and the video URLs are crawlable.
- Conflicting page schema — if VideoObject JSON-LD on the page disagrees with the sitemap (different duration, thumbnail), engines may distrust both.
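Several of the errors above can be caught before submission with a small pre-flight check. The sketch below assumes each video is held as a plain dict keyed by tag name without the `video:` prefix, with tags stored as a list under `tags`; that record shape and the function name are assumptions for illustration, and the check covers only the limits stated in this article.

```python
def validate_video_entry(entry):
    """Return a list of problems found in one video entry (a plain dict)."""
    problems = []
    for field in ("thumbnail_loc", "title", "description"):
        if not entry.get(field):
            problems.append(f"missing {field}")
    if not (entry.get("content_loc") or entry.get("player_loc")):
        problems.append("need content_loc or player_loc")
    duration = entry.get("duration")
    if duration is not None and not 1 <= duration <= 28_800:
        problems.append("duration out of range (1-28800 seconds)")
    if len(entry.get("description", "")) > 2048:
        problems.append("description longer than 2048 characters")
    if len(entry.get("tags", [])) > 32:
        problems.append("more than 32 tags")
    return problems


good = {
    "thumbnail_loc": "https://example.com/img/ai-mode-thumb.jpg",
    "title": "How Google AI Mode answers multimodal queries",
    "description": "A 6-minute walkthrough of AI Mode.",
    "content_loc": "https://example.com/video/ai-mode-walkthrough.mp4",
    "duration": 372,
}
bad = {"thumbnail_loc": "t.jpg", "title": "x", "description": "d", "duration": 0}
```

Here `validate_video_entry(good)` returns an empty list, while `validate_video_entry(bad)` reports the missing location and the out-of-range duration.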
Video sitemap vs related signals
| Signal | Strength for AI discovery | Notes |
|---|---|---|
| Baseline | Required; sourced via | |
| Video sitemap | High | Closes JS-embed gaps |
| VideoObject JSON-LD | High | Pairs with transcript URL |
| On-page transcript | Critical | Citable text for AI answers |
| Open Graph og:video | Medium | Used for social previews |
Common mistakes
- Mixing video entries into a large standard sitemap.xml file, which makes debugging and resubmission harder (a separate file is usually easier to maintain).
- Skipping the on-page transcript — AI engines need quotable text.
- Using YouTube auto-captions only — not as accurate or as durable as a self-hosted transcript.
- Setting expiration_date too aggressively and orphaning citation paths.
- Forgetting to update sitemap on video re-encodes that change content_loc.
How to validate and deploy
- Generate the video sitemap from your video CMS or manifest.
- Validate XML structure and submit via Google Search Console.
- Reference it from robots.txt via the Sitemap: directive.
- Verify each content_loc / player_loc is reachable to Googlebot and AI crawlers.
- Re-generate on every video publish, replace, or removal.
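The generation step can be sketched with Python's standard library. This is one possible approach under stated assumptions: the manifest is a list of plain dicts, the `page_url` key and the function name `build_video_sitemap` are hypothetical, and only a subset of tags is written.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Child tags written in this order when present in the manifest record.
VIDEO_TAGS = ("thumbnail_loc", "title", "description", "content_loc",
              "player_loc", "duration", "publication_date")


def build_video_sitemap(videos):
    """Serialize manifest records (plain dicts) into a video sitemap string."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for video in videos:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = video["page_url"]
        node = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for tag in VIDEO_TAGS:
            if video.get(tag) is not None:
                # ElementTree escapes &, <, and > in text on serialization.
                ET.SubElement(node, f"{{{VIDEO_NS}}}{tag}").text = str(video[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_video_sitemap([{
    "page_url": "https://example.com/learn/ai-mode-walkthrough",
    "thumbnail_loc": "https://example.com/img/ai-mode-thumb.jpg",
    "title": "How Google AI Mode answers multimodal queries",
    "description": "A 6-minute walkthrough of AI Mode.",
    "content_loc": "https://example.com/video/ai-mode-walkthrough.mp4",
    "duration": 372,
}])
```

Building the document through an XML library rather than string templates avoids the raw-HTML-in-title error, since special characters are escaped on output.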
FAQ
Q: Can I include a video sitemap inside my main sitemap.xml?
Yes. Google accepts video tags either in an existing sitemap or in a separate video sitemap. A separate file is often easier to debug and resubmit, and it keeps the 50 MB / 50,000-entry limits from affecting your main sitemap.
Q: Do I need both content_loc and player_loc?
At least one is required. If both are present, content_loc is preferred when accessible because engines can probe the file directly.
Q: Is there a video:transcript tag?
No. Google's video sitemap namespace does not define a transcript tag. Provide the transcript on the hosting page (HTML) and reference it via VideoObject.transcript JSON-LD.
Q: Do AI crawlers like GPTBot read video sitemaps?
Major AI crawlers honor robots.txt and sitemap directives. Listing videos in a sitemap improves the chance they fetch the hosting page and pair it with on-page transcript text for citation.
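Whether a given crawler is allowed to fetch a video or its hosting page can be checked offline against your robots.txt rules. The sketch below uses Python's `urllib.robotparser` on an inline, hypothetical robots.txt; the user-agent names are examples, and the check only evaluates the rules as written. It does not confirm how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemaps/video.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url, agents=("Googlebot", "GPTBot")):
    """Map each user agent to whether the parsed rules allow fetching `url`."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


public = crawlable("https://example.com/video/ai-mode-walkthrough.mp4")
blocked = crawlable("https://example.com/private/members-only.mp4")
declared_sitemaps = parser.site_maps()  # Sitemap: lines found in the file
```

In production, `parser.set_url(...)` followed by `parser.read()` would load the live file instead of the inline string.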
Q: Should I include YouTube videos hosted off my domain?
Yes, when the video is embedded on a page you control. Use player_loc with the embed URL. The hosting page becomes the citation target.
Q: How often should I regenerate the sitemap?
On every publish, replacement, or metadata change. Sitemaps with stale lastmod or duration drift erode discovery quality.