WebSite Schema and SearchAction Specification for AI Search
WebSite schema is site-level JSON-LD that declares the canonical identity of a domain to AI search engines through the @id, url, name, alternateName, and publisher properties. A SearchAction nested under potentialAction tells AI agents how to invoke the site's internal search, and it remains useful for that purpose even though Google retired the Sitelinks Search Box on November 21, 2024.
TL;DR
WebSite schema is the root identity object for an entire domain. Place a single WebSite JSON-LD block on the homepage with @id, url, name, alternateName[], and publisher (linking to your Organization schema). Add SearchAction inside potentialAction if you want AI agents and remaining sitelinks-search clients to discover your internal search endpoint. Use a canonical @id URL ending with #website so other schema types can reference it via isPartOf.
What is WebSite schema
WebSite schema is a schema.org type (Thing > CreativeWork > WebSite) that describes a set of related web pages served from a single domain. Unlike WebPage schema, which describes one page, WebSite schema describes the domain itself: its canonical name, brand variants, language, publisher, and search interface.
For AI search engines, WebSite schema serves three jobs:
- Identity anchor. It pins the domain to a stable @id that other JSON-LD blocks (Article, BreadcrumbList, FAQPage) reference via isPartOf.
- Brand alias resolution. The alternateName array lets engines resolve brand variants (e.g., "Geodocs" vs "Geodocs.dev") to one entity.
- Search interface declaration. SearchAction tells agents where the internal search endpoint lives and how to construct queries.
Required and recommended properties
Required (for AI search citation eligibility)
| Property | Type | Notes |
|---|---|---|
| @context | URL | Always https://schema.org |
| @type | Text | WebSite |
| @id | URL | Canonical identifier, e.g. https://example.com/#website |
| url | URL | The site's canonical homepage URL |
| name | Text | Primary brand name (≤ 60 chars recommended) |
| publisher | Organization | Reference to Organization @id |
Recommended
| Property | Type | Notes |
|---|---|---|
| alternateName | Text[] | Brand variants, abbreviations, recognized misspellings |
| inLanguage | Text | BCP 47 code, e.g. en or en-US |
| description | Text | 120-160 character site description |
| potentialAction | SearchAction | Internal search endpoint declaration |
| copyrightYear | Number | Useful for AI freshness signals |
| copyrightHolder | Organization | Often the same as publisher |
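The two tables above can be turned into a quick pre-publish check. The following is a minimal sketch, not an official validator: the function name check_website_block and the shape of its return value are illustrative assumptions, and the key lists simply mirror the tables in this section.

```python
import json

# Property names copied from the "Required" and "Recommended" tables above.
REQUIRED = ("@context", "@type", "@id", "url", "name", "publisher")
RECOMMENDED = ("alternateName", "inLanguage", "description",
               "potentialAction", "copyrightYear", "copyrightHolder")


def check_website_block(block: dict) -> dict:
    """Report missing keys and a few simple value problems (hypothetical helper)."""
    problems = []
    if block.get("@type") != "WebSite":
        problems.append("@type should be 'WebSite'")
    name = block.get("name", "")
    if isinstance(name, str) and len(name) > 60:
        problems.append("name is longer than the recommended 60 characters")
    return {
        "missing_required": [k for k in REQUIRED if k not in block],
        "missing_recommended": [k for k in RECOMMENDED if k not in block],
        "problems": problems,
    }


if __name__ == "__main__":
    sample = json.loads("""
    {
      "@context": "https://schema.org",
      "@type": "WebSite",
      "url": "https://example.com/",
      "name": "Example Docs"
    }
    """)
    print(json.dumps(check_website_block(sample), indent=2))
```

Running it on the sample reports @id and publisher as missing required properties, which are the two omissions most likely to break cross-references from other blocks.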
SearchAction specification
SearchAction is the schema.org type used inside potentialAction to declare an internal search endpoint. A valid SearchAction must include:
- @type: SearchAction
- target: an EntryPoint with urlTemplate describing the search URL pattern, e.g. https://example.com/?s={search_term_string}
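- query-input: either the string shorthand required name=search_term_string (the form used in the example below) or a PropertyValueSpecification with valueRequired: true and valueName matching the placeholder in the urlTemplate (typically search_term_string)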
Note: Google retired the Sitelinks Search Box globally on November 21, 2024. SearchAction is no longer rendered as a search box under SERP results. It remains useful for AI agents, custom search clients, and as a forward-compatible signal.
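To make the relationship between urlTemplate and query-input concrete, here is a rough sketch of how a client might turn a SearchAction into a real search URL. The helper names parse_query_input and build_search_url are hypothetical, and accepting a bare string as target is an assumption for leniency; this is not how any particular engine is known to implement it.

```python
from urllib.parse import quote_plus


def parse_query_input(value) -> dict:
    """Normalize query-input to {'valueName': ..., 'valueRequired': ...}.

    Accepts the string shorthand ("required name=search_term_string") or a
    PropertyValueSpecification-style dict.
    """
    if isinstance(value, dict):
        return {
            "valueName": value.get("valueName"),
            "valueRequired": bool(value.get("valueRequired", False)),
        }
    spec = {"valueName": None, "valueRequired": False}
    for token in value.split():
        if token == "required":
            spec["valueRequired"] = True
        elif token.startswith("name="):
            spec["valueName"] = token[len("name="):]
    return spec


def build_search_url(action: dict, term: str) -> str:
    """Fill the urlTemplate placeholder with a URL-encoded search term."""
    spec = parse_query_input(action["query-input"])
    if not spec["valueName"]:
        raise ValueError("query-input does not declare a name")
    target = action["target"]
    # Assumption: tolerate a bare URL string as well as an EntryPoint object.
    template = target["urlTemplate"] if isinstance(target, dict) else target
    placeholder = "{" + spec["valueName"] + "}"
    if placeholder not in template:
        raise ValueError(f"{placeholder} not found in urlTemplate")
    return template.replace(placeholder, quote_plus(term))


if __name__ == "__main__":
    action = {
        "@type": "SearchAction",
        "target": {
            "@type": "EntryPoint",
            "urlTemplate": "https://example.com/search?q={search_term_string}",
        },
        "query-input": "required name=search_term_string",
    }
    print(build_search_url(action, "json ld"))
    # prints https://example.com/search?q=json+ld
```

The ValueError on a missing placeholder catches the most common authoring slip: a valueName that does not match the token inside the urlTemplate braces.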
Canonical JSON-LD example
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": "https://example.com/#website",
  "url": "https://example.com/",
  "name": "Example Docs",
  "alternateName": ["Example", "ExampleDocs", "example.com"],
  "inLanguage": "en-US",
  "description": "Technical documentation for AI search optimization.",
  "publisher": { "@id": "https://example.com/#organization" },
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
</script>
How AI engines use WebSite schema
AI search engines (Google AI Overviews, Perplexity, ChatGPT Search, Bing Copilot) ingest structured data as one signal among many during retrieval and citation. WebSite schema specifically helps with:
- Entity disambiguation. When two domains share a name, alternateName arrays and publisher links resolve which entity is being cited.
- Citation attribution. The name from WebSite schema is the canonical brand label engines may surface in citation chips.
- Cross-document binding. Article and BreadcrumbList schemas reference WebSite via isPartOf: { @id }, giving engines a graph view of the domain's content.
- Multilingual routing. inLanguage, plus the schema.org translation properties workTranslation and translationOfWork where applicable, helps engines route queries to the correct localized version.
Schema markup is not a citation guarantee — it is a clarity signal. Industry coverage of structured data and AI search emphasizes that schema reduces ambiguity rather than directly boosting visibility.
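The cross-document binding described above can be illustrated with a small generator for subpage markup. This is only a sketch under assumptions: the function name article_jsonld and the #article fragment convention are illustrative, and SITE_ID reuses the @id from the canonical example earlier in this document.

```python
import json

# Assumption: the homepage WebSite block uses this @id, as in the canonical example.
SITE_ID = "https://example.com/#website"


def article_jsonld(url: str, headline: str) -> dict:
    """Build a minimal Article node that points back to the WebSite node."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": url + "#article",  # illustrative fragment convention
        "url": url,
        "headline": headline,
        # A reference by @id only; the full WebSite block stays on the homepage.
        "isPartOf": {"@id": SITE_ID},
    }


if __name__ == "__main__":
    node = article_jsonld("https://example.com/docs/website-schema",
                          "WebSite Schema and SearchAction")
    print(json.dumps(node, indent=2))
```

Because isPartOf carries only an @id, the subpage stays small while a parser that has seen the homepage block can still join the two nodes into one graph.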
Implementation patterns
Pattern 1: Single homepage block (most sites)
Place one WebSite JSON-LD block on the homepage only. Reference its @id from every other page's schema. Avoid duplicating WebSite blocks on subpages.
Pattern 2: Multilingual sites
Emit one WebSite block per language version with locale-suffixed @ids: #website-en, #website-fr. Each version's pages reference its own language-scoped WebSite @id.
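As a rough illustration of this pattern, the sketch below generates one WebSite block per language. Several details are assumptions rather than requirements: the function name localized_website_blocks, the locale-to-path mapping, hanging every @id off the root origin, and sharing a single #organization publisher across locales.

```python
import json


def localized_website_blocks(origin: str, name: str, locales: dict) -> list:
    """Return one WebSite block per language code in `locales`.

    `locales` maps a language code to the path of that language's homepage,
    e.g. {"en": "/", "fr": "/fr/"} (hypothetical site layout).
    """
    origin = origin.rstrip("/")
    blocks = []
    for lang, path in locales.items():
        blocks.append({
            "@context": "https://schema.org",
            "@type": "WebSite",
            "@id": f"{origin}/#website-{lang}",  # locale-suffixed @id
            "url": f"{origin}{path}",
            "name": name,
            "inLanguage": lang,
            "publisher": {"@id": f"{origin}/#organization"},
        })
    return blocks


if __name__ == "__main__":
    for block in localized_website_blocks(
        "https://example.com/", "Example Docs", {"en": "/", "fr": "/fr/"}
    ):
        print(json.dumps(block, indent=2))
```

Each language's pages would then reference the matching @id (for example https://example.com/#website-fr) from their own isPartOf.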
Pattern 3: Documentation sites
For docs domains, publisher is usually the parent Organization. Add inLanguage: "en" and a docs-specific description. The internal SearchAction urlTemplate should match the docs search endpoint, not a marketing site search.
Pattern 4: SaaS apps with separate marketing and app domains
Emit a WebSite block on the marketing domain only. The app domain typically does not need WebSite schema since it is gated and not indexed.
Pattern 5: News and editorial sites
Add copyrightHolder, copyrightYear, and issn if applicable. AI engines that weight freshness benefit from explicit copyrightYear values.
WebSite vs related schema types
| Schema | Scope | Typical placement |
|---|---|---|
| WebSite | Whole domain | Homepage only |
| WebPage | One page | Every indexable page |
| Organization | Brand entity | Homepage, About page |
| Article | Individual content piece | Article page |
WebSite is not a replacement for Organization. They cohabit: WebSite has a publisher property that resolves to an Organization @id. Both blocks should be present on the homepage of a branded site.
Common mistakes
- Duplicating WebSite blocks on every page. WebSite belongs only on the homepage; other pages reference its @id.
- Skipping @id. Without an @id, other schema types cannot reference WebSite via isPartOf, breaking the graph.
- Setting urlTemplate to a third-party search engine URL. The urlTemplate must point at the site's own internal search endpoint.
- Filling alternateName with keyword variants. alternateName is for legitimate brand variants, abbreviations, and recognized misspellings — not SEO keywords.
- Omitting publisher. Without publisher, AI engines cannot bind WebSite to a brand entity, weakening citation attribution.
- Using query instead of query-input in SearchAction. Schema.org's documented form is query-input.
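Two of the SearchAction mistakes above are mechanical enough to lint for. The sketch below is a heuristic, not a validator: lint_search_action is a hypothetical helper, and the host comparison will also flag a legitimate search endpoint hosted on a different subdomain, so treat its output as a prompt to review rather than a verdict.

```python
from urllib.parse import urlparse


def lint_search_action(block: dict) -> list:
    """Return warnings for common SearchAction mistakes in a WebSite block."""
    warnings = []
    action = block.get("potentialAction")
    if not isinstance(action, dict):
        return warnings

    # Mistake: 'query' used where the documented key is 'query-input'.
    if "query" in action and "query-input" not in action:
        warnings.append("SearchAction uses 'query'; the documented key is 'query-input'")

    # Mistake: urlTemplate points at a host other than the site's own.
    target = action.get("target", {})
    template = target.get("urlTemplate", "") if isinstance(target, dict) else str(target)
    site_host = urlparse(block.get("url", "")).hostname
    search_host = urlparse(template).hostname
    if site_host and search_host and site_host != search_host:
        warnings.append(
            f"urlTemplate host {search_host} differs from site host {site_host}"
        )
    return warnings


if __name__ == "__main__":
    suspicious = {
        "url": "https://example.com/",
        "potentialAction": {
            "@type": "SearchAction",
            "target": {
                "@type": "EntryPoint",
                "urlTemplate": "https://www.google.com/search?q={search_term_string}",
            },
            "query": "required name=search_term_string",
        },
    }
    for warning in lint_search_action(suspicious):
        print(warning)
```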
How to validate
- Run the Schema Markup Validator on your homepage URL and fix any errors it reports on the WebSite block. Malformed JSON-LD may be skipped entirely by parsers.
- Run Google's Rich Results Test for legacy diagnostic value (sitelinks search box is no longer rendered, but the test still flags structural errors).
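- Curl the homepage HTML and grep for "@type" followed by "WebSite", allowing optional whitespace after the colon (for example, grep -Ec '"@type": *"WebSite"'). Pretty-printed JSON-LD usually has a space there, so an exact-string match on "@type":"WebSite" can miss it. There should be exactly one match.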
- Verify the SearchAction urlTemplate by replacing {search_term_string} with a test term and loading the URL. It should return a search results page.
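The grep check is sensitive to whitespace and to @graph wrappers, so a parser-based count can be more reliable. The following is a minimal sketch using only the Python standard library; JsonLdExtractor and count_website_blocks are illustrative names, and fetching the homepage HTML is left to the caller.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def iter_nodes(doc):
    """Yield every JSON object in a parsed document, including @graph members."""
    if isinstance(doc, list):
        for item in doc:
            yield from iter_nodes(item)
    elif isinstance(doc, dict):
        yield doc
        for value in doc.values():
            yield from iter_nodes(value)


def count_website_blocks(html: str) -> int:
    """Count JSON-LD nodes whose @type is (or includes) WebSite."""
    parser = JsonLdExtractor()
    parser.feed(html)
    count = 0
    for node in iter_nodes(parser.blocks):
        types = node.get("@type")
        types = types if isinstance(types, list) else [types]
        if "WebSite" in types:
            count += 1
    return count
```

Feed it the homepage HTML you fetched and expect a result of 1. References such as isPartOf: { "@id": ... } carry no @type, so they are not counted as duplicate WebSite blocks.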
FAQ
Q: Should I keep SearchAction now that Google retired the Sitelinks Search Box?
Yes, with reduced expectations. Google announced the Sitelinks Search Box retirement in October 2024 and removed it globally on November 21, 2024. SearchAction no longer triggers a Google SERP search box. AI agents, custom clients, and Yandex still consume it, and adding it costs nothing. Treat it as a forward-compatible signal rather than a Google rich result trigger.
Q: Where should the WebSite JSON-LD block live?
Only on the homepage. Other pages reference its @id (https://example.com/#website) via isPartOf from their own WebPage or Article schema. Duplicating WebSite on every page does not help and clutters the graph.
Q: Can I have multiple WebSite blocks for multilingual sites?
Yes. Emit one block per locale with distinct @ids (e.g., #website-en, #website-fr). Each locale's pages reference the matching WebSite @id. Set inLanguage correctly on each block.
Q: Does WebSite schema replace Organization schema?
No. WebSite describes the site; Organization describes the legal/brand entity. WebSite has a publisher property that resolves to an Organization @id. Both are typically present on the homepage.
Q: How does WebSite schema affect AI Overviews and Perplexity citations?
Indirectly. WebSite schema does not guarantee citations. It clarifies the domain's identity (name, alternateName, publisher), which AI engines may use during entity resolution and citation attribution. Consistent, validated WebSite schema makes it more likely that citation chips show the intended brand name.
Q: Should the urlTemplate include https or http?
Use https. AI agents and modern crawlers expect HTTPS urlTemplates, and an http urlTemplate on an otherwise https site may be ignored by some clients.