Geodocs.dev

WebSite Schema and SearchAction Specification for AI Search

WebSite schema is site-level JSON-LD that declares the canonical identity of a domain to AI search engines through @id, url, name, alternateName, and publisher properties. SearchAction nested as potentialAction tells AI agents how to invoke the site's internal search even after Google retired the Sitelinks Search Box on November 21, 2024.

TL;DR

WebSite schema is the root identity object for an entire domain. Place a single WebSite JSON-LD block on the homepage with @id, url, name, alternateName[], and publisher (linking to your Organization schema). Add SearchAction inside potentialAction if you want AI agents and remaining sitelinks-search clients to discover your internal search endpoint. Use a canonical @id URL ending with #website so other schema types can reference it via isPartOf.

What is WebSite schema

WebSite schema is a schema.org type (Thing > CreativeWork > WebSite) that describes a set of related web pages served from a single domain. Unlike WebPage schema, which describes one page, WebSite schema describes the domain itself: its canonical name, brand variants, language, publisher, and search interface.

For AI search engines, WebSite schema serves three jobs:

  1. Identity anchor. It pins the domain to a stable @id that other JSON-LD blocks (Article, BreadcrumbList, FAQPage) reference via isPartOf.
  2. Brand alias resolution. The alternateName array lets engines resolve brand variants (e.g., "Geodocs" vs "Geodocs.dev") to one entity.
  3. Search interface declaration. SearchAction tells agents where the internal search endpoint lives and how to construct queries.

Required (for AI search citation eligibility)

Property | Type | Notes
@context | URL | Always https://schema.org
@type | Text | WebSite
@id | URL | Canonical identifier, e.g. https://example.com/#website
url | URL | The site's canonical homepage URL
name | Text | Primary brand name (≤ 60 chars recommended)
publisher | Organization | Reference to Organization @id

Recommended

Property | Type | Notes
alternateName | Text[] | Brand variants, abbreviations, recognized misspellings
inLanguage | Text | BCP 47 code, e.g. en or en-US
description | Text | 120-160 character site description
potentialAction | SearchAction | Internal search endpoint declaration
copyrightYear | Number | Useful for AI freshness signals
copyrightHolder | Organization | Often the same as publisher
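
As a rough illustration of how the required properties fit together, the following Python sketch builds a WebSite block and wraps it in a script tag. The helper names (build_website_jsonld, to_script_tag) and the example.com values are assumptions made for this example, not part of schema.org or any library.

```python
import json


def build_website_jsonld(base_url, name, organization_id,
                         alternate_names=None, language=None):
    """Return a WebSite JSON-LD dict (hypothetical helper for illustration)."""
    base = base_url.rstrip("/")
    node = {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "@id": f"{base}/#website",               # canonical identifier ending in #website
        "url": f"{base}/",                       # canonical homepage URL
        "name": name,
        "publisher": {"@id": organization_id},   # reference to the Organization @id
    }
    # Recommended properties are only emitted when a value is supplied.
    if alternate_names:
        node["alternateName"] = list(alternate_names)
    if language:
        node["inLanguage"] = language
    return node


def to_script_tag(node):
    """Serialize the node for embedding in the homepage HTML."""
    # Escaping "<" keeps a stray "</script>" inside a string value
    # from closing the tag early; \u003c is a valid JSON escape.
    body = json.dumps(node, indent=2, ensure_ascii=False).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{body}\n</script>'


website = build_website_jsonld(
    "https://example.com",
    "Example Docs",
    "https://example.com/#organization",
    alternate_names=["Example", "ExampleDocs"],
    language="en-US",
)
print(to_script_tag(website))
```

Generating the block from one function keeps @id, url, and publisher consistent wherever the site template needs them.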

SearchAction specification

SearchAction is the schema.org type used inside potentialAction to declare an internal search endpoint. A valid SearchAction must include:

  • @type: SearchAction
  • target: an EntryPoint with urlTemplate describing the search URL pattern, e.g. https://example.com/?s={search_term_string}
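  • query-input: a PropertyValueSpecification with valueRequired: true and valueName matching the placeholder in the urlTemplate (typically search_term_string). The text shorthand "required name=search_term_string" expresses the same thing and is the form most sites publish.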

Note: Google retired the Sitelinks Search Box globally on November 21, 2024. SearchAction is no longer rendered as a search box under SERP results. It remains useful for AI agents, custom search clients, and as a forward-compatible signal.
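
The three parts of a SearchAction can be sketched in Python as follows. The function names and the PLACEHOLDER constant are hypothetical; the sketch assumes the placeholder is search_term_string and shows both the text shorthand and the expanded PropertyValueSpecification form of query-input.

```python
from urllib.parse import quote

PLACEHOLDER = "search_term_string"  # assumed placeholder name


def build_search_action(url_template, expanded=False):
    """Return a SearchAction dict for the given urlTemplate."""
    if "{" + PLACEHOLDER + "}" not in url_template:
        raise ValueError(f"urlTemplate must contain {{{PLACEHOLDER}}}")
    if expanded:
        # Expanded form: an explicit PropertyValueSpecification object.
        query_input = {
            "@type": "PropertyValueSpecification",
            "valueRequired": True,
            "valueName": PLACEHOLDER,
        }
    else:
        # Text shorthand for the same specification.
        query_input = f"required name={PLACEHOLDER}"
    return {
        "@type": "SearchAction",
        "target": {"@type": "EntryPoint", "urlTemplate": url_template},
        "query-input": query_input,
    }


def expand_template(url_template, term):
    """Fill the placeholder the way a consuming client might (percent-encoded)."""
    return url_template.replace("{" + PLACEHOLDER + "}", quote(term, safe=""))


action = build_search_action("https://example.com/search?q={search_term_string}")
print(expand_template(action["target"]["urlTemplate"], "ai search"))
```

How a real agent encodes the query term is up to that agent; percent-encoding with quote is only one plausible choice.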

Canonical JSON-LD example

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "@id": "https://example.com/#website",
  "url": "https://example.com/",
  "name": "Example Docs",
  "alternateName": ["Example", "ExampleDocs", "example.com"],
  "inLanguage": "en-US",
  "description": "Technical documentation for AI search optimization.",
  "publisher": { "@id": "https://example.com/#organization" },
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
</script>

How AI engines use WebSite schema

AI search engines (Google AI Overviews, Perplexity, ChatGPT Search, Bing Copilot) ingest structured data as one signal among many during retrieval and citation. WebSite schema specifically helps with:

  • Entity disambiguation. When two domains share a name, alternateName arrays and publisher links resolve which entity is being cited.
  • Citation attribution. The name from WebSite schema is the canonical brand label engines may surface in citation chips.
  • Cross-document binding. Article and BreadcrumbList schemas reference WebSite via isPartOf: { @id }, giving engines a graph view of the domain's content.
  • Multilingual routing. inLanguage, plus workTranslation or translationOfWork links between language versions where applicable, helps engines route queries to the correct localized version.

Schema markup is not a citation guarantee — it is a clarity signal. Industry coverage of structured data and AI search emphasizes that schema reduces ambiguity rather than directly boosting visibility.
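
The cross-document binding described above might look like this when an article page is generated in Python. The article_jsonld helper, the #article fragment, and the example URLs are assumptions for illustration; the only part taken from the text is the isPartOf reference to the WebSite @id.

```python
import json

WEBSITE_ID = "https://example.com/#website"  # @id declared on the homepage


def article_jsonld(page_url, headline, website_id=WEBSITE_ID):
    """Return an Article node that points back at the site-level WebSite node."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": f"{page_url}#article",       # hypothetical per-page identifier
        "url": page_url,
        "headline": headline,
        "isPartOf": {"@id": website_id},    # reference only; the WebSite block is not repeated
    }


article = article_jsonld(
    "https://example.com/guides/website-schema/",
    "WebSite Schema and SearchAction",
)
print(json.dumps(article, indent=2))
```

The article page carries only the @id reference, so the full WebSite block stays on the homepage.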

Implementation patterns

Pattern 1: Single homepage block (most sites)

Place one WebSite JSON-LD block on the homepage only. Reference its @id from every other page's schema. Avoid duplicating WebSite blocks on subpages.

Pattern 2: Multilingual sites

Emit one WebSite block per language version with locale-suffixed @ids: #website-en, #website-fr. Each version's pages reference its own language-scoped WebSite @id.
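
A minimal sketch of the per-language pattern, assuming each locale has its own homepage URL under a path such as /en/ or /fr/. Both that URL layout and the localized_website_blocks helper are hypothetical.

```python
def localized_website_blocks(base_url, name, organization_id, locale_homepages):
    """Return one WebSite block per language, each with a locale-suffixed @id."""
    base = base_url.rstrip("/")
    blocks = []
    for lang, homepage in locale_homepages.items():
        blocks.append({
            "@context": "https://schema.org",
            "@type": "WebSite",
            "@id": f"{base}/#website-{lang}",   # e.g. #website-en, #website-fr
            "url": homepage,
            "name": name,
            "inLanguage": lang,
            "publisher": {"@id": organization_id},
        })
    return blocks


blocks = localized_website_blocks(
    "https://example.com",
    "Example Docs",
    "https://example.com/#organization",
    {"en": "https://example.com/en/", "fr": "https://example.com/fr/"},
)
for block in blocks:
    print(block["@id"], block["inLanguage"])
```

Pages in each locale would then set isPartOf to the matching language-scoped @id.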

Pattern 3: Documentation sites

For docs domains, publisher is usually the parent Organization. Add inLanguage: "en" and a docs-specific description. The internal SearchAction urlTemplate should match the docs search endpoint, not a marketing site search.

Pattern 4: SaaS apps with separate marketing and app domains

Emit a WebSite block on the marketing domain only. The app domain typically does not need WebSite schema since it is gated and not indexed.

Pattern 5: News and editorial sites

Add copyrightHolder, copyrightYear, and issn if applicable. AI engines that weight freshness benefit from explicit copyrightYear values.

Schema | Scope | Typical placement
WebSite | Whole domain | Homepage only
WebPage | One page | Every indexable page
Organization | Brand entity | Homepage, About page
Article | Individual content piece | Article page
WebSite is not a replacement for Organization. They cohabit: WebSite has a publisher property that resolves to an Organization @id. Both blocks should be present on the homepage of a branded site.
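
One way to keep the two blocks in sync on the homepage is to generate them together. The sketch below emits both nodes in a single @graph, which is an assumption of this example; two separate script blocks work equally well. The homepage_graph helper and the sample names are hypothetical.

```python
import json


def homepage_graph(base_url, site_name, org_name):
    """Return a JSON-LD document holding both the Organization and WebSite nodes."""
    base = base_url.rstrip("/")
    org_id = f"{base}/#organization"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Organization",
                "@id": org_id,
                "name": org_name,
                "url": f"{base}/",
            },
            {
                "@type": "WebSite",
                "@id": f"{base}/#website",
                "url": f"{base}/",
                "name": site_name,
                "publisher": {"@id": org_id},  # resolves to the Organization node above
            },
        ],
    }


graph = homepage_graph("https://example.com", "Example Docs", "Example Inc.")
print(json.dumps(graph, indent=2))
```

Because both @id values come from the same base URL, the publisher reference cannot drift from the Organization identifier.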

Common mistakes

  • Duplicating WebSite blocks on every page. WebSite belongs only on the homepage; other pages reference its @id.
  • Skipping @id. Without an @id, other schema types cannot reference WebSite via isPartOf, breaking the graph.
  • Setting urlTemplate to a third-party search engine URL. The urlTemplate must point at the site's own internal search endpoint.
  • Filling alternateName with keyword variants. alternateName is for legitimate brand variants, abbreviations, and recognized misspellings — not SEO keywords.
  • Omitting publisher. Without publisher, AI engines cannot bind WebSite to a brand entity, weakening citation attribution.
  • Using query instead of query-input in SearchAction. Schema.org's documented form is query-input.
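
Several of these mistakes can be caught mechanically. The following lint function is a sketch under stated assumptions: lint_website is a hypothetical helper, it checks only four of the items above, and it treats a urlTemplate on any host other than the site's own as a third-party endpoint.

```python
from urllib.parse import urlparse


def lint_website(node, site_host):
    """Return a list of problems found in a WebSite JSON-LD dict."""
    problems = []
    if not node.get("@id"):
        problems.append("missing @id")
    if not node.get("publisher"):
        problems.append("missing publisher")

    action = node.get("potentialAction")
    if isinstance(action, dict):
        if "query" in action and "query-input" not in action:
            problems.append("uses query instead of query-input")
        # target may be an EntryPoint object or a plain URL string.
        target = action.get("target")
        if isinstance(target, dict):
            template = target.get("urlTemplate", "")
        elif isinstance(target, str):
            template = target
        else:
            template = ""
        if template:
            parsed = urlparse(template)
            if parsed.hostname != site_host:
                problems.append("urlTemplate points at a different host")
            if parsed.scheme != "https":
                problems.append("urlTemplate should use https")
    return problems


bad = {
    "@type": "WebSite",
    "potentialAction": {
        "@type": "SearchAction",
        "target": "http://www.google.com/search?q={search_term_string}",
        "query": "required name=search_term_string",
    },
}
for problem in lint_website(bad, "example.com"):
    print(problem)
```

Checks such as keyword-stuffed alternateName values need human judgment and are left out of the sketch.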

How to validate

  1. Run the Schema Markup Validator on your homepage URL. Errors reported here mean a parser may be unable to read the WebSite block at all, so fix them first.
  2. Run Google's Rich Results Test for legacy diagnostic value (sitelinks search box is no longer rendered, but the test still flags structural errors).
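  3. Fetch the homepage HTML with curl and search it for the WebSite type declaration, allowing for optional whitespace after the colon (serializers emit both "@type":"WebSite" and "@type": "WebSite"). There should be exactly one match.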
  4. Verify the SearchAction urlTemplate by replacing {search_term_string} with a test term and loading the URL. It should return a search results page.
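
Step 3 can also be done by parsing the JSON-LD rather than pattern matching, which avoids whitespace problems. The sketch below uses only the Python standard library and runs against an inline sample page; fetching the real homepage is left out, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            # json.loads raises ValueError on malformed JSON, which is itself a finding.
            self.documents.append(json.loads("".join(self._buffer)))


def iter_nodes(doc):
    """Yield top-level nodes, descending into lists and @graph arrays."""
    if isinstance(doc, list):
        for item in doc:
            yield from iter_nodes(item)
    elif isinstance(doc, dict):
        yield doc
        yield from iter_nodes(doc.get("@graph", []))


def find_website_nodes(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    found = []
    for doc in parser.documents:
        for node in iter_nodes(doc):
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "WebSite" in types:
                found.append(node)
    return found


SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite",
 "@id": "https://example.com/#website", "url": "https://example.com/"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "@id": "https://example.com/#organization"}
</script>
</head><body></body></html>
"""

matches = find_website_nodes(SAMPLE_HTML)
print(len(matches), "WebSite block(s) found")
```

A count other than one on the homepage, or a nonzero count on a subpage, points at the duplication mistake described earlier.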

FAQ

Q: Is SearchAction still worth adding after Google retired the Sitelinks Search Box?

Yes, with reduced expectations. Google announced the Sitelinks Search Box retirement in October 2024 and removed it globally on November 21, 2024, so SearchAction no longer triggers a search box in Google results. AI agents, custom clients, and Yandex can still consume it, and adding it costs nothing. Treat it as a forward-compatible signal rather than a Google rich result trigger.

Q: Where should the WebSite JSON-LD block live?

Only on the homepage. Other pages reference its @id (https://example.com/#website) via isPartOf from their own WebPage or Article schema. Duplicating WebSite on every page does not help and clutters the graph.

Q: Can I have multiple WebSite blocks for multilingual sites?

Yes. Emit one block per locale with distinct @ids (e.g., #website-en, #website-fr). Each locale's pages reference the matching WebSite @id. Set inLanguage correctly on each block.

Q: Does WebSite schema replace Organization schema?

No. WebSite describes the site; Organization describes the legal/brand entity. WebSite has a publisher property that resolves to an Organization @id. Both are typically present on the homepage.

Q: How does WebSite schema affect AI Overviews and Perplexity citations?

Indirectly. WebSite schema does not guarantee citations. It clarifies the domain's identity (name, alternateName, publisher), which AI engines can use during entity resolution and citation attribution. Consistent, validated WebSite schema gives engines less room to mislabel the brand in citation chips.

Q: Should the urlTemplate include https or http?

Use https. AI agents and modern crawlers expect HTTPS endpoints, and an http urlTemplate declared on an HTTPS site may be ignored by clients that refuse insecure targets.
