AI Citation Share Dashboard Framework: Tracking Share of Voice Across AI Engines
An AI citation share dashboard measures how often your brand is cited or recommended across ChatGPT, Perplexity, Gemini, Copilot, and Google AI Overviews relative to competitors. The framework combines six metrics (citation rate, share of voice, recommendation rate, prompt-level win rate, sentiment, AI referral traffic) sampled across a stable prompt panel on a weekly cadence, then visualized in Looker Studio or Google Sheets with competitor benchmarks.
TL;DR
A usable generative engine optimization (GEO) dashboard does three things: it samples a fixed prompt panel across multiple AI engines on a regular cadence, it normalizes citation and mention data into share-of-voice metrics, and it pairs those metrics with downstream AI referral traffic. This framework gives you the prompt panel design, the six core KPIs, sampling cadence, the data model, and a Looker Studio / Sheets layout you can ship in a week.
Why a citation share dashboard
AI search behavior is volatile. AirOps reports that only 30% of brands stay visible from one AI answer to the next, and just 20% remain visible across five consecutive runs of the same query. Without a dashboard, point-in-time prompt tests give you a misleading picture of visibility.
A dashboard solves three operational problems:
- Stability. Repeated sampling smooths volatility into a trend you can act on.
- Comparability. Share-of-voice math lets you compare your brand against named competitors instead of arguing about absolute counts.
- Attribution. Linking citations to AI referral traffic and pipeline shows whether visibility actually moves revenue.
Six core metrics
This framework standardizes on six KPIs that map cleanly onto GEO goals. Track all six per engine and per prompt segment.
- Citation rate — percent of sampled prompts where your domain is cited as a source. Primary leading indicator.
- Share of voice (SoV) — your citations divided by the total citations of you plus a fixed competitor set. Always normalize.
- Recommendation rate — percent of prompts where the AI explicitly recommends your product or service (mention plus positive intent), not just cites you.
- Prompt-level win rate — percent of head-to-head prompts where you appear above (or instead of) a named competitor.
- Sentiment — sentiment score of the AI's framing of your brand on a −1 to +1 scale.
- AI referral traffic — sessions, conversions, and pipeline attributed to AI referrers (chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com) in your analytics tool.
Metrics 1-5 come from prompt sampling; metric 6 comes from your analytics platform. They must live in the same dashboard so a single change — say, a content refresh — can be traced across all of them.
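To make the SoV normalization concrete, here is a minimal sketch assuming per-brand weekly citation counts for one engine; the domains and counts are placeholders, not benchmarks.

```python
# Minimal share-of-voice sketch: normalize per-brand citation counts
# against a fixed competitor set. Domains and counts are placeholders.
weekly_citations = {"yourbrand.com": 18, "competitor-a.com": 42, "competitor-b.com": 25}

def share_of_voice(counts: dict[str, int], brand: str) -> float:
    """Brand citations divided by total citations across the tracked set."""
    total = sum(counts.values())
    return counts.get(brand, 0) / total if total else 0.0

print(round(share_of_voice(weekly_citations, "yourbrand.com"), 3))  # 0.212
```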
Step 1: Build a stable prompt panel
The prompt panel is the most important asset in the framework. Treat it like a benchmark: design it once, change it rarely, version it.
- Size: 60-150 prompts. Smaller panels are too noisy; larger panels become expensive to sample weekly.
- Coverage: split across three intents — informational ("what is X"), comparative ("X vs Y"), and evaluative ("best X for Y"). Recommendation behavior differs sharply across these.
- Segments: tag each prompt with funnel stage (awareness, consideration, decision), category (product, problem, ecosystem), and competitor set.
- Versioning: lock the panel in a spreadsheet with prompt_id, intent, segment, and created_at columns. Any change creates a new version; never edit prompts in place.
A worked panel for a B2B SaaS in the GEO category might include 30 informational prompts ("what is generative engine optimization"), 30 comparative ("Profound vs Evertune vs AthenaHQ"), and 30 evaluative ("best AI citation tracking tool for enterprise").
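As a sketch of what one locked panel row can look like, the structure below carries the versioning fields from this step; the field values and the frozen dataclass are illustrative, not a required schema.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative panel row carrying the versioning fields from Step 1.
@dataclass(frozen=True)              # frozen: prompts are never edited in place
class PanelPrompt:
    prompt_id: str                   # stable ID that survives panel versions
    text: str
    intent: str                      # informational | comparative | evaluative
    funnel_stage: str                # awareness | consideration | decision
    category: str                    # product | problem | ecosystem
    competitor_set: tuple[str, ...]
    panel_version: str
    created_at: date

example = PanelPrompt(
    prompt_id="geo-eval-014",
    text="best AI citation tracking tool for enterprise",
    intent="evaluative",
    funnel_stage="decision",
    category="product",
    competitor_set=("Profound", "Evertune", "AthenaHQ"),
    panel_version="q3-v1",
    created_at=date(2025, 7, 1),
)
```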
Step 2: Choose engines and sampling cadence
Cover the engines that drive real downstream behavior:
- Tier 1 (weekly): ChatGPT, Perplexity, Google AI Overviews.
- Tier 2 (bi-weekly): Gemini direct, Microsoft Copilot.
- Tier 3 (monthly): Claude, You.com, Andi, niche industry engines.
For each engine, run each prompt at least three times per cycle and aggregate. Single runs are unreliable for volatile engines like AI Overviews. Where the engine has an API (Perplexity Sonar, You.com Search/Research), automate. Where it does not, use a managed tracker (Profound, Evertune, AirOps, Nightwatch, Frase, AI Rank Lab) or a scripted browser harness.
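A minimal harness sketch of the run-each-prompt-N-times pattern follows; `query_engine` is a hypothetical wrapper around whichever API or scripted browser you use, and its placeholder body plus the one-second pause are assumptions, not a real integration.

```python
import time
from datetime import datetime, timezone

RUNS_PER_CYCLE = 3  # single runs are unreliable for volatile engines

def query_engine(engine: str, prompt_text: str) -> dict:
    """Hypothetical wrapper: swap in the engine's API (e.g. Perplexity Sonar)
    or a scripted browser harness. Returns the answer text and its citations."""
    return {"answer": "", "citations": []}  # placeholder so dry runs complete

def sample_panel(panel: list[dict], engines: list[str]) -> list[dict]:
    """Run every prompt N times per engine and emit one row per citation."""
    rows = []
    for prompt in panel:
        for engine in engines:
            for run_index in range(1, RUNS_PER_CYCLE + 1):
                answer = query_engine(engine, prompt["text"])
                for position, domain in enumerate(answer["citations"], start=1):
                    rows.append({
                        "sample_at": datetime.now(timezone.utc).isoformat(),
                        "prompt_id": prompt["prompt_id"],
                        "engine": engine,
                        "run_index": run_index,
                        "domain": domain.lower(),
                        "position": position,
                    })
                time.sleep(1)  # placeholder rate-limit pause
    return rows
```

The rows it emits map directly onto the fact table defined in Step 3.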
Step 3: Define the data model
Keep one row per (prompt, engine, run, citation). The fact table looks like this:
| Column | Type | Notes |
|---|---|---|
| sample_at | timestamp | When the prompt was issued |
| prompt_id | string | Joins to prompt panel |
| engine | string | chatgpt, perplexity, gemini, copilot, aio, etc. |
| run_index | int | 1..N for the cycle |
| domain | string | Lower-cased citation domain |
| is_brand | bool | Yours or a competitor (resolve via lookup) |
| position | int | 1-based citation order in the answer |
| mention_type | enum | citation, named_mention, recommendation |
| sentiment | float | −1..+1 from a sentiment scoring step |
| answer_excerpt | text | First 500 chars of the AI answer |
| answer_url | string | Permalink if available |
Derive the six KPIs as views on top of this fact table. Never aggregate at insert time — aggregations should be reproducible from raw runs.
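Here is a minimal pandas sketch of two of those views (citation rate and share of voice). It assumes the fact table is exported as a CSV and that runs with zero citations are still logged with a blank domain row so the denominators count every run; the file name and domain sets are placeholders.

```python
import pandas as pd

# Fact table from Step 3: one row per (prompt, engine, run, citation).
facts = pd.read_csv("citation_facts.csv")

YOUR_DOMAINS = {"yourbrand.com"}
TRACKED_DOMAINS = YOUR_DOMAINS | {"competitor-a.com", "competitor-b.com"}
facts["is_yours"] = facts["domain"].isin(YOUR_DOMAINS)

# Citation rate per engine: share of prompt runs that cite your domain.
per_run = facts.groupby(["engine", "prompt_id", "run_index"])["is_yours"].any()
citation_rate = per_run.groupby("engine").mean().rename("citation_rate")

# Share of voice per engine: your citations over citations of the tracked set.
tracked = facts[facts["domain"].isin(TRACKED_DOMAINS)]
share_of_voice = tracked.groupby("engine")["is_yours"].mean().rename("share_of_voice")

print(pd.concat([citation_rate, share_of_voice], axis=1))
```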
Step 4: Wire AI referral traffic
In GA4 (or your analytics tool), create a custom channel group called AI Search that includes referrers from chatgpt.com, perplexity.ai, gemini.google.com, copilot.microsoft.com, you.com, and known UTM patterns. Surface sessions, engaged sessions, conversions, and pipeline value per source. Walker Sands and HubSpot both highlight AI referral traffic and pipeline impact as the GEO metrics that survive executive scrutiny, so they belong on page one of the dashboard.
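If you want to sanity-check the channel group offline, a minimal sketch that classifies an exported sessions report by referrer is below; the export file and its column names are assumptions to adjust to your own schema, and the referrer list mirrors the sources named above.

```python
import pandas as pd

# Referrers that roll up into the custom "AI Search" channel group.
AI_REFERRERS = {
    "chatgpt.com", "perplexity.ai", "gemini.google.com",
    "copilot.microsoft.com", "you.com",
}

# Assumed export: one row per session with session_source, session_id,
# and conversions columns.
sessions = pd.read_csv("ga4_sessions_export.csv")
is_ai = sessions["session_source"].str.lower().isin(AI_REFERRERS)
sessions["channel"] = is_ai.map({True: "AI Search", False: "Other"})

print(
    sessions.groupby("channel").agg(
        sessions=("session_id", "count"),
        conversions=("conversions", "sum"),
    )
)
```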
Step 5: Build the dashboard layout
A single Looker Studio (or Sheets) report works. Use four pages.
Page 1: Executive summary
- KPI tiles: Citation rate, SoV, Recommendation rate, AI referral sessions, AI referral pipeline.
- Trend chart: SoV by week across Tier 1 engines for the last 13 weeks.
- Headline competitor table: SoV per competitor, sorted descending.
Page 2: Engine drilldown
- One section per engine with citation rate, SoV, win rate, sentiment.
- Top cited domains within each engine (helps identify PR and partnership targets).
- Volatility indicator: percentage of prompts where citations changed between runs.
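A minimal sketch of that volatility indicator, computed from the Step 3 fact table: the share of prompts per engine whose cited-domain set differs between runs within the same cycle.

```python
import pandas as pd

def volatility(facts: pd.DataFrame) -> pd.Series:
    """Share of prompts per engine whose cited-domain set changed across
    runs within one sampling cycle (facts is the Step 3 fact table)."""
    domain_sets = (
        facts.groupby(["engine", "prompt_id", "run_index"])["domain"]
        .apply(frozenset)
    )
    distinct_sets = domain_sets.groupby(["engine", "prompt_id"]).nunique()
    return (distinct_sets > 1).groupby("engine").mean().rename("volatility")
```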
Page 3: Prompt segment view
- KPIs split by intent (informational / comparative / evaluative).
- Funnel-stage view to surface where your visibility weakens between awareness and decision.
- Worst-performing segments table feeding the content backlog.
Page 4: Content attribution
- For each cited URL, show citations earned per week, AI referral sessions, and last update date.
- Highlight pages cited often but rarely updated — they are the highest-leverage refresh candidates.
- Link back to the Content Freshness Signals for AI Search checklist.
Step 6: Sampling discipline
A dashboard is only as honest as its sampling. Three rules:
- Fix the panel before measuring change. Adding prompts retroactively biases trends.
- Sample at the same time of week and day. AI Overviews and ChatGPT search have meaningful diurnal variation.
- Resolve competitor brands centrally. Maintain a brand alias dictionary (e.g., you.com, you-com, You) so SoV math is correct.
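A minimal alias-resolution sketch is below; the alias entries are illustrative, and the dictionary should live in one shared file so every pipeline step resolves brands identically.

```python
# Central brand alias dictionary: every name or domain variant maps to one
# canonical brand so SoV math counts them together. Entries are illustrative.
BRAND_ALIASES = {
    "you.com": "You.com",
    "you-com": "You.com",
    "you": "You.com",
    "perplexity.ai": "Perplexity",
    "perplexity": "Perplexity",
}

def resolve_brand(raw: str) -> str:
    """Return the canonical brand, or the cleaned raw value if untracked."""
    key = raw.strip().lower()
    return BRAND_ALIASES.get(key, key)
```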
Step 7: Trigger the operating cadence
The dashboard is the input to a weekly operating review:
- Monday: sampling job runs.
- Tuesday morning: content lead reviews segment view, files refresh tickets for the bottom 10% segments.
- Wednesday: PR / comms review top-cited third-party domains (Reddit, YouTube, Gartner) for outreach opportunities.
- Friday: GEO lead presents page-one summary in the marketing standup.
Without a cadence, the dashboard becomes a museum. With one, it becomes the operating system for the GEO program.
Common pitfalls
- Tracking absolute mention counts instead of SoV. Counts go up when AI engines simply talk more; SoV controls for that.
- Ignoring sentiment. A negative recommendation is worse than no mention. Always carry sentiment alongside frequency.
- One-shot prompt sampling. Visibility volatility (~70% on AI Overviews per industry analyses) means a single sample is not a measurement.
- Not attaching analytics. A dashboard that never connects to revenue gets defunded.
FAQ
Q: How many prompts do I need to start?
Start with 60 prompts split 20 / 20 / 20 across informational, comparative, and evaluative intents. That's enough to detect meaningful SoV shifts week over week without overwhelming your sampling budget.
Q: Should I build this in-house or buy a tool?
Buy a tracker (Profound, Evertune, AirOps, AI Rank Lab, Nightwatch, Frase) for the sampling layer if you don't already have engineering bandwidth, then own the dashboard layer in Looker Studio so the data model fits your other marketing reporting. Hybrid is the most common pattern.
Q: What is a healthy citation rate?
Industry write-ups (Averi, AirOps) cite 20-30% citation rate as the threshold of meaningful AI visibility for SaaS leaders, with most startups starting near 0%. Treat the threshold as directional and benchmark against your specific competitor set rather than industry averages.
Q: How often should I refresh the prompt panel?
Lock the panel for at least one quarter so trends are interpretable. Add a small "experimental" sub-panel (5-10 prompts) you rotate monthly for emerging topics; never edit the locked core.
Q: How do I handle AI Overviews variance?
Run each AIO prompt three to five times per cycle and aggregate. Note the volatility in the dashboard so stakeholders interpret week-over-week movement correctly. AIO content reportedly changes ~70% of the time, so multiple samples are non-negotiable.
Related Articles
Article Schema Markup Checklist for AI Search Engines
Article schema markup checklist for AI search: 30 fields LLM crawlers consume to surface citations on ChatGPT, Perplexity, and AI Overviews.
Vector Embedding Optimization for AI Search Citations
Vector embedding optimization for AI citations: how chunking, density, and semantic clarity influence retrieval in RAG-powered LLM search engines.
Perplexity vs You.com vs Andi: AI Search Engines Compared in 2026
Perplexity vs You vs Andi: feature-by-feature AI search engine comparison covering citation styles, query handling, and GEO implications in 2026.