
AI Citation Monitoring Tool Buyer's Checklist: 30 Criteria for Evaluating Profound, Otterly, and Optiview in 2026



Use this 30-criterion weighted scorecard to evaluate AI citation monitoring vendors (Profound, Otterly, Optiview, Nightwatch, Peec) across engine coverage, citation intelligence, integrations, governance, and ROI fit before signing a contract.

TL;DR

Most AI citation monitoring vendors look identical in marketing copy. This 30-point checklist forces a structured comparison across six categories — engine coverage, citation intelligence, prompt research, action layer, integrations, and governance — with weights and pass/fail thresholds. Score each vendor 0-3 per criterion and pick the highest weighted total, not the loudest demo.

How to use this checklist

Score each criterion on a 0-3 scale: 0 = missing, 1 = partial, 2 = solid, 3 = best-in-class. Sum the scores within each category to get a raw sub-total, multiply that sub-total by the category weight, and add the six weighted sub-totals for a total out of a maximum of 228. Anything under 60 percent of that maximum (about 137) is a miss; 60-75 percent is acceptable; above 75 percent (about 171) is a strong fit. Apply a hard pass/fail on three non-negotiable gates at the end.
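A minimal sketch of that arithmetic, assuming you have already filled in raw category sub-totals for one vendor; the example numbers are invented, and the 60/75 percent bands mirror the thresholds above.

```python
# Weights and raw maxima per category, as defined in this checklist.
CATEGORY_WEIGHTS = {
    "Engine coverage": 3,
    "Citation intelligence": 3,
    "Prompt research": 2,
    "Action layer": 3,
    "Integrations": 2,
    "Governance": 2,
}
CATEGORY_MAX = {  # criteria count x 3 points each
    "Engine coverage": 15,
    "Citation intelligence": 18,
    "Prompt research": 12,
    "Action layer": 15,
    "Integrations": 15,
    "Governance": 15,
}

def score_vendor(subtotals: dict[str, int]) -> tuple[int, str]:
    """Return (weighted total, verdict band) for one vendor's raw category sub-totals."""
    total = sum(subtotals[cat] * weight for cat, weight in CATEGORY_WEIGHTS.items())
    max_total = sum(CATEGORY_MAX[cat] * weight for cat, weight in CATEGORY_WEIGHTS.items())  # 228
    share = total / max_total
    band = "strong fit" if share >= 0.75 else "acceptable" if share >= 0.60 else "miss"
    return total, band

# Invented sub-totals for an imaginary vendor.
print(score_vendor({
    "Engine coverage": 11, "Citation intelligence": 13, "Prompt research": 8,
    "Action layer": 10, "Integrations": 10, "Governance": 11,
}))  # -> (160, 'acceptable')
```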

For a deeper architectural decision (build vs buy vs hybrid), pair this checklist with our citation monitoring stack selection framework. For broader tooling context, see the tools hub and AI visibility tracking tools guide.

Category 1 — Engine coverage (weight ×3)

  1. Coverage of the top five engines. ChatGPT, Perplexity, Google AI Overviews, Gemini, and Microsoft Copilot must all be tracked natively, not via screenshots or manual prompts.
  2. Long-tail engine coverage. At least three of Claude, Grok, Meta AI, DeepSeek, You.com, and Brave Leo should be supported.
  3. Geographic and language coverage. Multi-locale prompting (US, EU, APAC) and at least five languages, since AI engines route by region and language.
  4. Browser-mode coverage. Tracks ChatGPT Atlas, Comet, Arc, and Brave Leo agentic browsers — not only the chat surface.
  5. Refresh cadence. Daily refresh as default, with on-demand re-runs after content publishes; weekly-only platforms are no longer acceptable in 2026.

Category 2 — Citation intelligence (weight ×3)

  6. Source URL extraction. Captures the exact URL the engine cited, not just whether your brand was mentioned. This surfaces in AI visibility platform comparisons as the single biggest vendor differentiator.
  7. Citation vs mention separation. Distinguishes hyperlinked citations from synthesized brand mentions. See our direct citation vs synthesized mention reference.
  8. Share of voice math. Reports (your citations / total citations) × 100 per prompt cluster, normalized across engines; a worked calculation follows this list.
  9. Sentiment and stance. Classifies whether the AI describes you positively, neutrally, or negatively at the citation level, not just the thread level.
  10. Competitor co-citation graphs. Shows which competitors are cited alongside you and at what rate.
  11. Hallucination detection. Flags incorrect product features, pricing, or capabilities that the AI invented about your brand.
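To make the share-of-voice math in criterion 8 concrete, here is a small sketch assuming you already have per-engine citation counts for one prompt cluster; the engine names and counts are illustrative, and averaging per engine is just one reasonable way to normalize.

```python
# Citation counts for one prompt cluster (illustrative numbers only).
# "ours" = citations pointing at your domain, "total" = all citations observed.
cluster_counts = {
    "chatgpt":      {"ours": 4, "total": 22},
    "perplexity":   {"ours": 7, "total": 31},
    "ai_overviews": {"ours": 2, "total": 18},
}

def share_of_voice(counts: dict[str, dict[str, int]]) -> float:
    """Per-engine share of voice, averaged so no single engine dominates the number."""
    per_engine = [
        100.0 * c["ours"] / c["total"]
        for c in counts.values()
        if c["total"] > 0
    ]
    return sum(per_engine) / len(per_engine)

print(round(share_of_voice(cluster_counts), 1))
# chatgpt 18.2%, perplexity 22.6%, ai_overviews 11.1% -> about 17.3% normalized share
```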

Category 3 — Prompt research and intent (weight ×2)

  12. Prompt volume estimates. Reports approximate query volume per prompt (Profound's Prompt Volumes is the current bar).
  13. Topic clustering. Groups prompts into intent clusters automatically; allows manual override.
  14. Question discovery. Suggests new prompts your audience asks but you don't track yet.
  15. Funnel-stage tagging. Labels prompts as awareness, evaluation, or decision so you can prioritize bottom-funnel coverage.

Category 4 — Action layer (weight ×3)

  16. Page-to-prompt mapping. Tells you which of your URLs are getting cited for which prompts.
  17. Optimization recommendations. Suggests concrete content changes (FAQ additions, structured data, link sources).
  18. Brief generation. Produces a content brief tied to a citation gap with target prompts and competitor sources to displace.
  19. Crawler and indexability checks. Verifies that GPTBot, ClaudeBot, PerplexityBot, and Googlebot can access cited URLs; flags 4xx, 5xx, and JS-only renders. A rough access-check sketch follows this list.
  20. Real-time alerts. Slack, email, or webhook alerts when citation share drops more than a configurable threshold.
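As a rough illustration of criterion 19, the sketch below checks whether the major AI crawlers can reach a cited URL by reading robots.txt and issuing a HEAD request with each bot's user agent. It uses only the Python standard library; the URL list is a placeholder, and detecting JS-only renders is out of scope for something this simple.

```python
import urllib.error
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

BOT_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"]
CITED_URLS = ["https://example.com/pricing"]  # replace with the URLs engines actually cite

def check_access(url: str, agent: str) -> str:
    """Report whether a given crawler user agent can reach a cited URL."""
    parsed = urlparse(url)
    robots = urllib.robotparser.RobotFileParser(f"{parsed.scheme}://{parsed.netloc}/robots.txt")
    try:
        robots.read()
    except OSError:
        return "robots.txt unreadable"
    if not robots.can_fetch(agent, url):
        return f"blocked for {agent} in robots.txt"
    request = urllib.request.Request(url, headers={"User-Agent": agent}, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        return f"HTTP {err.code}"  # 4xx and 5xx responses are the ones to flag

for cited_url in CITED_URLS:
    for bot in BOT_AGENTS:
        print(cited_url, bot, check_access(cited_url, bot))
```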

Category 5 — Integrations and data portability (weight ×2)

  21. API access. REST or GraphQL with auth, pagination, and rate limits documented; no API is a hard fail for enterprise. A paginated-export sketch follows this list.
  22. CSV / data warehouse export. Native exports to BigQuery, Snowflake, or S3, or Looker/Tableau connectors. See our AI visibility report schema spec.
  23. CMS integrations. Push briefs and findings into Notion, Contentful, WordPress, or Sanity.
  24. GSC and analytics joins. Cross-references AI citations with Google Search Console clicks and GA4 / Amplitude events to estimate revenue lift.
  25. MCP / agent-readiness. Exposes data via Model Context Protocol tools so internal agents can query citation history.
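To show what criterion 21 implies in practice, the sketch below pages through a hypothetical vendor REST endpoint and dumps the rows to CSV. The base URL, the /citations path, the page/per_page parameters, and the bearer-token auth are all assumptions for illustration; substitute whatever the vendor actually documents.

```python
import csv
import requests  # third-party: pip install requests

BASE_URL = "https://api.example-vendor.com/v1"  # hypothetical endpoint, not a real vendor API
API_TOKEN = "YOUR_API_TOKEN"

def fetch_all_citations() -> list[dict]:
    """Page through the hypothetical citations endpoint until no rows remain."""
    rows, page = [], 1
    while True:
        response = requests.get(
            f"{BASE_URL}/citations",
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            params={"page": page, "per_page": 100},
            timeout=30,
        )
        response.raise_for_status()
        batch = response.json().get("data", [])
        if not batch:
            break
        rows.extend(batch)
        page += 1
    return rows

rows = fetch_all_citations()
if rows:
    with open("citations.csv", "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=sorted(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```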

Category 6 — Governance, pricing, and trust (weight ×2)

  26. SOC 2 Type II + GDPR. Required for any procurement-controlled buyer. HIPAA if you're in healthcare.
  27. Transparent pricing. Public starting tier with a clear per-prompt or per-domain unit; "contact us" as the only option is a yellow flag.
  28. Audit trail and roles. RBAC, SSO, and immutable audit logs for prompt and dashboard changes.
  29. Methodology disclosure. Documents how prompts are sampled, how sentiment is classified, and how citation parsing works; vendors that won't disclose are guessing.
  30. Reference customers in your segment. At least two named customers in your industry and company size willing to take a reference call.

Hard pass / fail gates (apply after scoring)

  • Engine coverage gate. If criteria 1-2 score below 4/6 combined, eliminate the vendor regardless of total.
  • Citation intelligence gate. If criterion 6 (source URL extraction) is 0, eliminate — you are paying for vibes, not data.
  • Governance gate. If criterion 26 (SOC 2 Type II) is 0 and you sell into regulated buyers, eliminate.
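The same three gates expressed as code, assuming you keep per-criterion scores keyed 1-30 as in the scorecard below; the function is a sketch, not a vendor feature.

```python
def passes_hard_gates(scores: dict[int, int], sells_to_regulated_buyers: bool) -> bool:
    """Apply the three elimination gates to one vendor's 0-3 criterion scores."""
    engine_gate = scores[1] + scores[2] >= 4      # criteria 1-2, combined out of 6
    citation_gate = scores[6] > 0                 # criterion 6: source URL extraction
    governance_gate = scores[26] > 0 or not sells_to_regulated_buyers  # criterion 26: SOC 2 Type II
    return engine_gate and citation_gate and governance_gate
```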

Scoring template (copy into a sheet)

Category                          Max   Weight   Vendor A   Vendor B   Vendor C
Engine coverage (Q1-5)             15   ×3
Citation intelligence (Q6-11)      18   ×3
Prompt research (Q12-15)           12   ×2
Action layer (Q16-20)              15   ×3
Integrations (Q21-25)              15   ×2
Governance (Q26-30)                15   ×2
Weighted total (max 228)

Vendor quick read (April 2026)

  • Profound — strongest enterprise governance, agent analytics, and prompt volumes; expensive; weakest on transparent self-serve pricing.
  • Otterly.AI — solid mid-market self-serve with explicit citation link analysis; lighter on action layer.
  • Nightwatch — bundles AI tracking with classic SEO and prompt research; best for SEO-led teams that want one tool.
  • Peec AI — clean prompt-tracking foundation; thin on engine coverage and recommendations.
  • Optiview — newer entrant focused on action-layer briefs and crawler diagnostics.

Vendor positions move quarterly; re-score every 6 months.

FAQ

Q: How many criteria should a vendor pass before I buy?

Aim for at least 24 of 30 criteria scoring 2 or higher, all three hard gates passing, and a weighted total above 75 percent of the maximum (roughly 171 out of 228). Below that, you are paying for a dashboard, not a decision system.

Q: Do I need to track all five top engines from day one?

No. Start with the top three engines that drive your category traffic (typically ChatGPT, Perplexity, Google AI Overviews) and require the vendor to add the others within 90 days as a contractual milestone.

Q: Is build-your-own ever cheaper than buying?

For most teams under 50 prompts, no. Above ~500 tracked prompts with custom engines, a hybrid stack often wins on unit cost. Use our citation monitoring stack selection framework to decide.

Q: How often should I re-run the buyer's checklist?

Re-score incumbent vendors every two quarters or after any major engine launch (for example, a new agentic browser). Vendor capabilities in this category drift faster than annual review cycles can capture.

Q: What is the single best signal a vendor is overselling?

They cannot show you the literal source URL the AI engine cited. If demo data is limited to brand-mention counts, criterion 6 fails and the rest of the scorecard rarely recovers it.

Related Articles

specification

AI Search KPIs: Define, Calculate, and Report (Dashboard Spec)

A specification for AI search KPIs — citation rate, mention lift, share-of-answer, query coverage — with formulas, sampling rules, and a dashboard layout for GEO/AEO reporting.

specification

AI Visibility Report Schema Specification: Standardized Citation Data Export Format Across Vendors

A vendor-neutral schema for exporting AI visibility data so agencies can normalize Profound, Peec, Otterly, and HubSpot AEO outputs into one BI-ready dataset.

framework

Citation Monitoring Stack Selection Framework: Build vs Buy vs Hybrid for AI Search Tracking

Decision framework to choose between building, buying, or hybrid citation monitoring stacks for AI search tracking across ChatGPT, Perplexity, and Gemini.
