RDFa for AI Search
RDFa is a W3C standard for embedding semantic data directly in HTML attributes. AI crawlers parse it, but JSON-LD is the recommended default for new pages, which leaves RDFa best suited to legacy enterprise, scientific, and government datasets that have already invested in linked-data tooling.
TL;DR
RDFa works for AI search but is not the recommended default. Use JSON-LD for new sites and pages. Keep RDFa where it is already integrated with linked-data systems (DBpedia, scientific publishing, government open-data portals) or where the team needs full RDF graph expressiveness that schema.org's JSON-LD profile does not cover.
Definition
RDFa (Resource Description Framework in Attributes) is a W3C Recommendation that adds attributes to HTML so that machine-readable data co-exists with human-readable content. RDFa Lite is a simplified subset that supports the most common cases with five attributes: vocab, typeof, property, resource, and prefix.
RDFa expresses subject-predicate-object triples directly in markup, which makes it a full RDF serialization. JSON-LD is also an RDF serialization, but it lives outside the visible HTML. Microdata is the third schema.org-compatible format and is the closest sibling to RDFa in DOM coupling.
Why RDFa Still Exists
RDFa predates JSON-LD by several years and has deep adoption in three communities. First, DBpedia and other linked-data projects extract RDFa from Wikipedia and structured web sources to build knowledge graphs. Second, scientific publishers use RDFa to encode bibliographic, dataset, and chemical entity metadata where the markup must travel with the rendered article. Third, government open-data portals frequently expose RDFa profiles so the same page is consumable as both a web page and an RDF dataset. AI engines that build their own knowledge graphs benefit when these sources keep their RDFa annotations clean.
How AI Crawlers Handle RDFa
Googlebot parses RDFa and exposes it in the Rich Results Test. Bingbot also parses RDFa; it powers structured-data extraction across Microsoft search surfaces. GPTBot, PerplexityBot, and ClaudeBot generally parse RDFa, but the parsing is more error-prone than JSON-LD because it requires a DOM walk and correct handling of vocab and prefix declarations. AI engines that consult third-party knowledge graphs (DBpedia, Wikidata) inherit data that originated as RDFa even when the AI engine itself parses JSON-LD on the live page.
The practical AI citation outcome is similar to microdata: RDFa works when validated cleanly, and AI citation rates are comparable to JSON-LD for the same content. The difference is maintenance cost and validation tooling, both of which favor JSON-LD.
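The vocab and prefix handling mentioned above is where hand-rolled parsers tend to go wrong. The following is a minimal sketch of term resolution, not the algorithm any particular crawler uses. The function names `parse_prefix_attr` and `expand_term` are hypothetical, and the sketch ignores any prefixes a conforming RDFa processor may predefine.

```python
from typing import Dict, Optional


def parse_prefix_attr(value: str) -> Dict[str, str]:
    """Parse a prefix attribute such as 'schema: https://schema.org/ dc: http://purl.org/dc/terms/'."""
    tokens = value.split()
    mappings: Dict[str, str] = {}
    # Tokens alternate between "name:" and the IRI that name maps to.
    for name, iri in zip(tokens[0::2], tokens[1::2]):
        if name.endswith(":"):
            mappings[name[:-1]] = iri
    return mappings


def expand_term(term: str, vocab: Optional[str], prefixes: Dict[str, str]) -> Optional[str]:
    """Resolve a property/typeof token to a full IRI, or None if it cannot be resolved."""
    if "://" in term:
        return term  # already an absolute IRI
    if ":" in term:
        prefix, _, local = term.partition(":")
        base = prefixes.get(prefix)
        # An undeclared prefix yields no IRI in this sketch, so the triple would be dropped.
        return base + local if base is not None else None
    # A bare term is only meaningful when a vocab is in scope.
    return vocab + term if vocab else None
```

A bare term such as `headline` resolves only when a `vocab` is in scope, and a prefixed term resolves only when its prefix was declared. Either miss silently removes a triple, which is why RDFa failures are often invisible on the rendered page.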
Embedding Patterns
RDFa Lite Example
<article vocab="https://schema.org/" typeof="Article">
  <h1 property="headline">RDFa for AI Search</h1>
  <span property="author" typeof="Person">
    <span property="name">Jane Doe</span>
  </span>
  <time property="datePublished" datetime="2026-05-03">May 3, 2026</time>
  <div property="articleBody">…</div>
</article>
Full RDFa Example
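<div prefix="schema: https://schema.org/ dc: http://purl.org/dc/terms/"
     typeof="schema:Dataset"
     resource="https://example.org/data/2026">
  <span property="schema:name dc:title">Open Climate Data 2026</span>
  <span property="schema:license" resource="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</span>
</div>
Declaring prefix and writing prefix:term in property lets one block mix several vocabularies. Because prefix is one of the five RDFa Lite attributes, this particular example stays within the Lite attribute set. Full RDFa adds further attributes, such as about, rel, rev, content, datatype, and inlist, for graphs that the Lite attributes cannot express.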
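To make the subject-predicate-object reading of the markup concrete, here is a toy extractor built on Python's standard `html.parser`. It is a sketch under strong assumptions: it handles only `vocab`, `typeof`, `property`, and `resource`, treats `content` or `datetime` as the literal value when present, does not resolve `prefix:term` tokens, and assumes well-nested markup. The class name `RdfaLiteSketch` is hypothetical, and a real pipeline should use a conforming RDFa processor instead.

```python
from html.parser import HTMLParser
from itertools import count

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no end tag


class RdfaLiteSketch(HTMLParser):
    """Toy extractor for single-vocabulary RDFa Lite; not a conforming processor."""

    def __init__(self, base=""):
        super().__init__()
        self.triples = []
        self._bnodes = count()
        # One frame per open element: subject and vocab in scope, plus any
        # predicates still waiting for the element's text content.
        self._stack = [{"subject": base, "vocab": "", "pending": [], "text": []}]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        parent = self._stack[-1]
        vocab = a.get("vocab") or parent["vocab"]
        subject = parent["subject"]
        preds = [vocab + p for p in (a.get("property") or "").split()]
        types = [vocab + t for t in (a.get("typeof") or "").split()]
        resource = a.get("resource")
        child_subject, pending = subject, []
        if types:
            # typeof starts a new node: a named resource or a fresh blank node.
            node = resource or f"_:b{next(self._bnodes)}"
            self.triples += [(subject, p, node) for p in preds]
            self.triples += [(node, RDF_TYPE, t) for t in types]
            child_subject = node
        elif resource and preds:
            # property + resource: the object is an IRI, not the element text.
            self.triples += [(subject, p, resource) for p in preds]
        elif resource:
            child_subject = resource
        elif preds:
            literal = a.get("content") or a.get("datetime")
            if literal is not None:
                self.triples += [(subject, p, literal) for p in preds]
            else:
                pending = preds  # value is the text content, known at the end tag
        if tag not in VOID_TAGS:
            self._stack.append(
                {"subject": child_subject, "vocab": vocab, "pending": pending, "text": []}
            )

    def handle_data(self, data):
        for frame in self._stack:
            if frame["pending"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or len(self._stack) == 1:
            return
        frame = self._stack.pop()
        text = "".join(frame["text"]).strip()
        self.triples += [(frame["subject"], p, text) for p in frame["pending"]]


def extract_triples(html: str):
    parser = RdfaLiteSketch()
    parser.feed(html)
    parser.close()
    return parser.triples
```

Fed the RDFa Lite example above, `extract_triples` yields one blank node typed as `https://schema.org/Article`, a second blank node typed as `https://schema.org/Person` linked through `author`, and literal values for `headline`, `name`, and `datePublished`.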
AI Crawler Parser Support
| Crawler | RDFa Lite | Full RDFa | Notes |
|---|---|---|---|
| Googlebot | Yes | Yes | Surfaces in Rich Results Test |
| Bingbot | Yes | Yes | Powers Microsoft AI surfaces |
| GPTBot | Yes | Partial | Reliable for schema.org vocabulary; multi-vocab less consistent |
| PerplexityBot | Yes | Partial | Same caveats as GPTBot |
| ClaudeBot | Yes | Partial | Same caveats as GPTBot |
| Google-Extended | Yes | Yes | Inherits Googlebot parsing |
When RDFa Still Makes Sense
- DBpedia ingestion. Pages whose primary downstream consumer is the linked-data community.
- Scientific publishing. Articles annotated with bibliographic, dataset, or chemical entity vocabularies where the markup must remain attached to the rendered prose.
- Government open data. Pages that double as RDF datasets for portal harvesting.
- Existing schema.org RDFa profiles. Sites that already validate cleanly and would gain little from migration.
When to Migrate to JSON-LD
- The site's structured data scope is limited to schema.org types covered by JSON-LD.
- The team wants programmatic structured-data generation from typed models.
- CI validation cost is high because RDFa requires DOM-aware tooling.
- AI crawlers are missing markup on complex pages, indicated by gaps in Rich Results Test or in citation share for queries the page should win.
Migration Path
- Inventory. Crawl and tag pages by vocabulary used (schema.org only vs multi-vocabulary).
- Translate. Generate JSON-LD from the same source-of-truth model used to emit RDFa attributes.
- Co-existence. Keep RDFa in place while serving JSON-LD; both are valid encodings.
- Validation. Confirm no diff between extracted graphs through the Schema.org validator.
- Deprecation. Remove RDFa attributes from templates only after JSON-LD has been stable for at least one full crawl cycle.
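The Translate and Co-existence steps are easier to keep consistent when both encodings are rendered from one typed model. The sketch below assumes a hypothetical `ArticleModel` with three fields; the function names are illustrative and do not come from any specific framework.

```python
import json
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ArticleModel:
    """Hypothetical source-of-truth record for one page."""
    headline: str
    author_name: str
    date_published: str  # ISO 8601 date, e.g. "2026-05-03"


def to_jsonld(m: ArticleModel) -> str:
    """Serialize the model as a schema.org JSON-LD document."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": m.headline,
        "author": {"@type": "Person", "name": m.author_name},
        "datePublished": m.date_published,
    }
    return json.dumps(doc, indent=2)


def to_jsonld_script(m: ArticleModel) -> str:
    """Wrap the JSON-LD for embedding; '</' is escaped so the payload cannot close the script element."""
    payload = to_jsonld(m).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


def to_rdfa(m: ArticleModel) -> str:
    """Render the same facts as RDFa Lite attributes on visible markup."""
    return (
        '<article vocab="https://schema.org/" typeof="Article">\n'
        f'  <h1 property="headline">{escape(m.headline)}</h1>\n'
        '  <span property="author" typeof="Person">'
        f'<span property="name">{escape(m.author_name)}</span></span>\n'
        f'  <time property="datePublished" datetime="{escape(m.date_published)}">'
        f'{escape(m.date_published)}</time>\n'
        '</article>'
    )
```

Because both renderers read the same fields, the two encodings cannot drift apart during the co-existence window, and removing RDFa later means deleting one function rather than auditing templates.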
Validation
- RDFa Play (rdfa.info) for triple-by-triple inspection.
- Google Rich Results Test for schema.org RDFa profiles.
- Schema.org Validator for type checks.
- Custom CI that runs an RDF parser against staged pages and asserts the expected triples exist.
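The custom CI check in the last bullet can be as small as a pattern match over extracted triples. This sketch assumes the triples have already been produced by whatever RDF parser the pipeline uses and are represented as plain `(subject, predicate, object)` string tuples. `None` acts as a wildcard so that blank-node labels, which differ between parsers, do not cause false failures. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(triple: Triple, pattern: Pattern) -> bool:
    """True when every non-wildcard position of the pattern equals the triple."""
    return all(want is None or want == got for got, want in zip(triple, pattern))


def missing_patterns(triples: Iterable[Triple], expected: Iterable[Pattern]) -> List[Pattern]:
    """Return the expected patterns that no extracted triple satisfies."""
    extracted = list(triples)
    return [p for p in expected if not any(matches(t, p) for t in extracted)]


def assert_expected_triples(triples: Iterable[Triple], expected: Iterable[Pattern]) -> None:
    """Fail the build with a readable message when expected triples are absent."""
    missing = missing_patterns(triples, expected)
    if missing:
        raise AssertionError(f"missing expected triples: {missing}")
```

A staged-page check would then list a handful of must-have patterns per template, for example `(None, "https://schema.org/headline", None)`, and fail the build when any of them is absent.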
Common Mistakes
- Mixing vocab and prefix declarations inconsistently, producing invalid graphs.
- Forgetting resource on the parent element, which leaves the subject as an anonymous node and detaches the triples from the entity they were meant to describe.
- Using deprecated xmlns declarations from earlier RDFa profiles.
- Maintaining RDFa alongside JSON-LD with conflicting values; choose one source of truth.
- Treating RDFa as an SEO-only investment and leaving it broken for AI crawlers.
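Two of the mistakes above, deprecated xmlns declarations and prefixes that are used but never declared, can be caught with a simple lint pass. The sketch below uses Python's standard `html.parser` and checks declarations at document level rather than per element scope, which is a simplification. It also flags prefixes that a conforming processor may predefine, so treat its output as warnings rather than errors. The names `RdfaLint` and `lint_rdfa` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class RdfaLint(HTMLParser):
    """Collects prefix declarations, prefix usage, and xmlns-style declarations."""

    def __init__(self):
        super().__init__()
        self.declared = set()
        self.used = set()
        self.warnings: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            value = value or ""
            if name.startswith("xmlns:"):
                self.warnings.append(f"<{tag}> uses {name}; prefer the prefix attribute")
            elif name == "prefix":
                # Declarations alternate "name:" and IRI; keep only the names.
                tokens = value.split()
                self.declared.update(t[:-1] for t in tokens[0::2] if t.endswith(":"))
            elif name in ("property", "typeof"):
                for term in value.split():
                    if ":" in term and "://" not in term:
                        self.used.add(term.split(":", 1)[0])


def lint_rdfa(html: str) -> List[str]:
    """Return human-readable warnings for the two checks described above."""
    linter = RdfaLint()
    linter.feed(html)
    linter.close()
    undeclared = sorted(linter.used - linter.declared)
    return linter.warnings + [f"prefix '{p}' is used but never declared" for p in undeclared]
```

Running `lint_rdfa` over rendered templates in CI gives an early signal before a full RDF parse, though it does not replace graph-level validation.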
FAQ
Q: Do AI engines prefer RDFa or JSON-LD?
In practice JSON-LD, because parsing is deterministic. RDFa works but is more sensitive to DOM quirks. The output graph can be identical when both validate cleanly.
Q: Is RDFa deprecated?
No. RDFa remains a W3C Recommendation and is supported by major crawlers. The deprecation narrative comes from Google's recommendation of JSON-LD, not from a standards change.
Q: Can I use RDFa Lite instead of full RDFa?
Yes. RDFa Lite covers the cases needed for schema.org markup and is easier to author. Use full RDFa only when you need graph features beyond the Lite attribute set, such as about, rel, rev, or datatype.
Q: Will migrating from RDFa to JSON-LD hurt AI citation rates?
In practice, no. Citation rates typically hold steady or improve after migration because JSON-LD is more reliably parsed. Validate carefully through the cutover.