RDFa for AI Search

RDFa is a W3C standard for embedding semantic data directly in HTML attributes. AI crawlers parse it, but JSON-LD is the recommended default for new pages, which leaves RDFa best suited to legacy enterprise, scientific, and government datasets that have already invested in linked-data tooling.

TL;DR

RDFa works for AI search but is not the recommended default. Use JSON-LD for new sites and pages. Keep RDFa where it is already integrated with linked-data systems (DBpedia, scientific publishing, government open-data portals) or where the team needs multi-vocabulary RDF graphs that go beyond the schema.org types search engines read from JSON-LD.

Definition

RDFa (Resource Description Framework in Attributes) is a W3C Recommendation that adds attributes to HTML so that machine-readable data co-exists with human-readable content. RDFa Lite is a simplified subset that supports the most common cases with five attributes: vocab, typeof, property, resource, and prefix.

RDFa expresses subject-predicate-object triples directly in markup, which makes it a full RDF serialization. JSON-LD is also an RDF serialization, but it lives outside the visible HTML. Microdata is the third schema.org-compatible format and is the closest sibling to RDFa in DOM coupling.

Why RDFa Still Exists

RDFa predates JSON-LD by several years and has deep adoption in three communities. First, the linked-data community around projects such as DBpedia publishes and harvests RDFa alongside other RDF serializations when building knowledge graphs. Second, scientific publishers use RDFa to encode bibliographic, dataset, and chemical entity metadata where the markup must travel with the rendered article. Third, government open-data portals frequently expose RDFa profiles so the same page is consumable as both a web page and an RDF dataset. AI engines that build their own knowledge graphs benefit when these sources keep their RDFa annotations clean.

How AI Crawlers Handle RDFa

Googlebot parses RDFa and exposes it in the Rich Results Test. Bingbot also parses RDFa; it powers structured-data extraction across Microsoft search surfaces. GPTBot, PerplexityBot, and ClaudeBot generally parse RDFa, but the parsing is more error-prone than JSON-LD because it requires a DOM walk and correct handling of vocab and prefix declarations. AI engines that consult third-party knowledge graphs (DBpedia, Wikidata) inherit data that originated as RDFa even when the AI engine itself parses JSON-LD on the live page.

The practical AI citation outcome is similar to microdata: RDFa works when validated cleanly, and AI citation rates are comparable to JSON-LD for the same content. The difference is maintenance cost and validation tooling, both of which favor JSON-LD.

Embedding Patterns

RDFa Lite Example

<article vocab="https://schema.org/" typeof="Article">
  <h1 property="headline">RDFa for AI Search</h1>
  <span property="author" typeof="Person">
    <span property="name">Jane Doe</span>
  </span>
  <time property="datePublished" datetime="2026-05-03">May 3, 2026</time>
  <div property="articleBody">…</div>
</article>
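
To make the triple structure concrete, here is a minimal sketch of how a parser might walk the markup above and emit triples. It uses only Python's standard html.parser module and handles just vocab, typeof, property, resource, and the datetime attribute of time; prefix handling, base IRIs, and literal datatypes are left out. The class name RdfaLiteSketch and the _:bN blank-node labels are illustrative assumptions and do not describe any crawler's actual implementation.

```python
from html.parser import HTMLParser

# Elements that never receive an end tag, so they must not stay on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class RdfaLiteSketch(HTMLParser):
    """Toy extractor for vocab/typeof/property/resource (no prefix support)."""

    def __init__(self, base=""):
        super().__init__()
        self.triples = []   # (subject, predicate, object) tuples
        self.bnodes = 0     # counter for blank-node labels
        # One frame per open element; the first frame stands for the document.
        self.stack = [{"tag": None, "vocab": "", "subject": base,
                       "prop": [], "text": []}]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        parent = self.stack[-1]
        vocab = a.get("vocab") or parent["vocab"]
        props = [vocab + p for p in (a.get("property") or "").split()]
        frame = {"tag": tag, "vocab": vocab, "subject": parent["subject"],
                 "prop": [], "text": []}
        if "typeof" in a:
            # typeof starts a new subject: the resource IRI or a blank node.
            node = a.get("resource")
            if not node:
                node = f"_:b{self.bnodes}"
                self.bnodes += 1
            for t in (a["typeof"] or "").split():
                self.triples.append((node, "rdf:type", vocab + t))
            for p in props:
                self.triples.append((parent["subject"], p, node))
            frame["subject"] = node
        elif props:
            value = a.get("resource") or a.get("datetime")
            if value:
                for p in props:
                    self.triples.append((parent["subject"], p, value))
            else:
                frame["prop"] = props  # object is the element's text content
        if tag not in VOID:
            self.stack.append(frame)

    def handle_data(self, data):
        for frame in self.stack:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if not any(f["tag"] == tag for f in self.stack):
            return  # stray end tag, ignore
        while True:
            frame = self.stack.pop()
            for p in frame["prop"]:
                text = "".join(frame["text"]).strip()
                self.triples.append((frame["subject"], p, text))
            if frame["tag"] == tag:
                break


sample = """
<article vocab="https://schema.org/" typeof="Article">
  <h1 property="headline">RDFa for AI Search</h1>
  <span property="author" typeof="Person">
    <span property="name">Jane Doe</span>
  </span>
  <time property="datePublished" datetime="2026-05-03">May 3, 2026</time>
</article>
"""
parser = RdfaLiteSketch()
parser.feed(sample)
for triple in parser.triples:
    print(triple)
```

The stack is the part that makes RDFa extraction DOM-dependent: the subject and vocabulary of every property come from its ancestors, so a misplaced or unclosed element changes which node a value attaches to.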

Full RDFa Example

<div prefix="schema: https://schema.org/ dc: http://purl.org/dc/terms/"
     typeof="schema:Dataset"
     resource="https://example.org/data/2026">
  <span property="schema:name dc:title">Open Climate Data 2026</span>
  <span property="schema:license" resource="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</span>
</div>

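The example combines two vocabularies in one block by declaring them in prefix and writing prefix:term in property. Strictly speaking, prefix is one of the five RDFa Lite attributes, so this markup still falls within the Lite subset. What full RDFa (RDFa Core) adds is further attributes such as about, rel, rev, content, datatype, and inlist, for graph shapes that Lite cannot express.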
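
The prefix mechanism itself is a simple lookup, which the following sketch illustrates. The function names parse_prefixes and expand are hypothetical helpers written for this example; a real RDFa processor additionally applies scoping rules and an initial context of predefined prefixes that this sketch omits.

```python
def parse_prefixes(value: str) -> dict[str, str]:
    """Turn 'schema: https://schema.org/ dc: http://...' into a mapping."""
    tokens = value.split()
    return {tokens[i].rstrip(":"): tokens[i + 1]
            for i in range(0, len(tokens) - 1, 2)}


def expand(terms: str, prefixes: dict[str, str], vocab: str = "") -> list[str]:
    """Expand each space-separated term in a property or typeof value."""
    expanded = []
    for term in terms.split():
        prefix, sep, local = term.partition(":")
        if sep and prefix in prefixes:
            expanded.append(prefixes[prefix] + local)   # prefix:term form
        elif not sep and vocab:
            expanded.append(vocab + term)               # bare term, uses vocab
        else:
            expanded.append(term)                       # absolute IRI or unknown
    return expanded


prefixes = parse_prefixes(
    "schema: https://schema.org/ dc: http://purl.org/dc/terms/")
print(expand("schema:name dc:title", prefixes))
print(expand("schema:Dataset", prefixes))
```

Because one property attribute can carry several terms, a single element such as the name span in the example contributes one triple per expanded term.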

AI Crawler Parser Support

Crawler	RDFa Lite	Full RDFa	Notes
Googlebot	Yes	Yes	Surfaces in Rich Results Test
Bingbot	Yes	Yes	Powers Microsoft AI surfaces
GPTBot	Yes	Partial	Reliable for schema.org vocabulary; multi-vocab less consistent
PerplexityBot	Yes	Partial	Same caveats as GPTBot
ClaudeBot	Yes	Partial	Same caveats as GPTBot
Google-Extended	Yes	Yes	Robots.txt control token rather than a separate crawler; pages are fetched and parsed by Google's existing crawlers

When RDFa Still Makes Sense

DBpedia ingestion. Pages whose primary downstream consumer is the linked-data community.
Scientific publishing. Articles annotated with bibliographic, dataset, or chemical entity vocabularies where the markup must remain attached to the rendered prose.
Government open data. Pages that double as RDF datasets for portal harvesting.
Existing schema.org RDFa profiles. Sites that already validate cleanly and would gain little from migration.

When to Migrate to JSON-LD

The site's structured data uses only schema.org types, all of which can be expressed in JSON-LD.
The team wants programmatic structured-data generation from typed models.
CI validation cost is high because RDFa requires DOM-aware tooling.
AI crawlers are missing markup on complex pages, indicated by gaps in Rich Results Test or in citation share for queries the page should win.

Migration Path

Inventory. Crawl and tag pages by vocabulary used (schema.org only vs multi-vocabulary).
Translate. Generate JSON-LD from the same source-of-truth model used to emit RDFa attributes.
Co-existence. Keep RDFa in place while serving JSON-LD; both are valid encodings.
Validation. Run both encodings through the Schema.org validator and confirm that the extracted graphs do not differ.
Deprecation. Remove RDFa attributes from templates only after JSON-LD has been stable for at least one full crawl cycle.
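
The Translate step can be sketched as follows, assuming the site already holds its article metadata in a typed record. ArticleModel, its field names, and jsonld_script are hypothetical names chosen for illustration; the sketch uses only the Python standard library and is a starting point rather than a complete generator.

```python
import json
from dataclasses import dataclass


@dataclass
class ArticleModel:
    """Hypothetical source-of-truth record shared by both encodings."""
    headline: str
    author_name: str
    date_published: str  # ISO 8601 date, e.g. "2026-05-03"

    def to_jsonld(self) -> dict:
        # Mirrors the properties used in the RDFa Lite example.
        return {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": self.headline,
            "author": {"@type": "Person", "name": self.author_name},
            "datePublished": self.date_published,
        }


def jsonld_script(data: dict) -> str:
    """Serialize a JSON-LD dict into a script element for the page head."""
    # Escape "</" so a string value cannot close the script element early.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


article = ArticleModel("RDFa for AI Search", "Jane Doe", "2026-05-03")
tag = jsonld_script(article.to_jsonld())
print(tag)
```

Emitting both the RDFa attributes and the JSON-LD block from the same record is what keeps the two encodings from drifting apart during the co-existence phase.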

Validation

RDFa Play (hosted on rdfa.info) for triple-by-triple inspection.
Google Rich Results Test for schema.org RDFa profiles.
Schema.org Validator for type checks.
Custom CI that runs an RDF parser against staged pages and asserts the expected triples exist.
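
The custom CI check in the last item might look like the sketch below. It assumes some RDF parser in the pipeline has already produced (subject, predicate, object) tuples for the staged page; the triples are hard-coded here, and the function name missing_pairs is a hypothetical helper. Subjects are ignored in the comparison because blank-node labels are not stable between runs.

```python
def missing_pairs(actual_triples, expected_pairs):
    """Return expected (predicate, object) pairs absent from the page's triples."""
    seen = {(predicate, obj) for _, predicate, obj in actual_triples}
    return sorted(pair for pair in expected_pairs if pair not in seen)


# Stand-in for parser output from a staged page (assumed, not real data).
actual = [
    ("_:b0", "rdf:type", "https://schema.org/Article"),
    ("_:b0", "https://schema.org/headline", "RDFa for AI Search"),
    ("_:b0", "https://schema.org/datePublished", "2026-05-03"),
]

expected = [
    ("rdf:type", "https://schema.org/Article"),
    ("https://schema.org/headline", "RDFa for AI Search"),
    ("https://schema.org/author", "_:b1"),
]

gaps = missing_pairs(actual, expected)
for predicate, obj in gaps:
    print(f"missing: {predicate} -> {obj}")
# A CI job would fail the build when gaps is non-empty.
```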

Common Mistakes

Mixing vocab and prefix declarations inconsistently, producing invalid graphs.
Omitting resource on the parent element, so nested properties attach to a blank node instead of the intended subject IRI.
Using deprecated xmlns: prefix declarations carried over from RDFa 1.0 instead of the prefix attribute.
Maintaining RDFa alongside JSON-LD with conflicting values; choose one source of truth.
Treating RDFa as an SEO-only investment and leaving it broken for AI crawlers.
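
Two of these mistakes are mechanical enough to lint for. The sketch below flags xmlns: declarations and prefixed terms whose prefix is never declared anywhere in the document. RdfaLint is a hypothetical class built on Python's standard html.parser; it ignores scoping, and RDFa 1.1 predefines some prefixes in its initial context, so an "undeclared prefix" message is a prompt to review rather than proof of an error.

```python
from html.parser import HTMLParser


class RdfaLint(HTMLParser):
    """Collects deprecated xmlns: declarations and undeclared prefixes."""

    def __init__(self):
        super().__init__()
        self.xmlns = []        # attribute names such as "xmlns:dc"
        self.declared = set()  # prefixes declared via the prefix attribute
        self.used = set()      # prefixes used in property or typeof

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            value = value or ""
            if name.startswith("xmlns:"):
                self.xmlns.append(name)
            elif name == "prefix":
                # Declarations alternate "name:" and IRI tokens.
                self.declared.update(
                    t[:-1] for t in value.split() if t.endswith(":"))
            elif name in ("property", "typeof"):
                for term in value.split():
                    prefix, sep, local = term.partition(":")
                    # Skip absolute IRIs such as https://schema.org/name.
                    if sep and not local.startswith("//"):
                        self.used.add(prefix)

    def report(self):
        messages = [f"deprecated declaration: {n}" for n in self.xmlns]
        messages += [f"undeclared prefix: {p}"
                     for p in sorted(self.used - self.declared)]
        return messages


page = """
<div xmlns:dc="http://purl.org/dc/terms/" prefix="schema: https://schema.org/">
  <span property="schema:name dc:title">Open Climate Data 2026</span>
</div>
"""
lint = RdfaLint()
lint.feed(page)
for message in lint.report():
    print(message)
```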

FAQ

Q: Do AI engines prefer RDFa or JSON-LD?

In practice JSON-LD, because it is parsed from a self-contained script block and does not depend on the surrounding DOM. RDFa works but is more sensitive to DOM quirks. The output graph can be identical when both validate cleanly.

Q: Is RDFa deprecated?

No. RDFa remains a W3C Recommendation and is supported by major crawlers. The deprecation narrative comes from Google's recommendation of JSON-LD, not from a standards change.

Q: Can I use RDFa Lite instead of full RDFa?

Yes. RDFa Lite covers the cases needed for schema.org markup and is easier to author. Use full RDFa only when you need graph features beyond the Lite attributes, such as datatype, rel, rev, or inlist.

Q: Will migrating from RDFa to JSON-LD hurt AI citation rates?

In practice, no. Citation rates typically hold steady or improve after migration because JSON-LD is more reliably parsed. Validate carefully through the cutover.