GEO Citation Decay Tracking Framework
A citation-decay tracking framework treats each AI citation as an instrumented unit, fits a survival curve to citation cohorts, sets refresh thresholds at named points on the curve (typically the empirical half-life and the 75th-percentile decay point), and routes every decaying citation through a refresh, revive, or retire decision before it disappears.
TL;DR
Most decay programs measure pages. This framework measures citations. Group every citation your tracker observes into weekly cohorts, fit a survival curve, derive an empirical half-life per query type and per engine, and trigger refresh actions when a citation's age crosses defined thresholds on that curve. Distributed and well-syndicated sources persist roughly twice as long as single-source pages in industry analyses, so distribution belongs inside the framework, not next to it.
Why per-citation decay matters
Industry analyses converge on a striking pattern: AI citations are short-lived. One published study of 3.5 million citation events between September 2025 and March 2026 estimated an average citation half-life of four to five weeks. Another longitudinal dataset of more than 108,000 distinct citation URLs found that roughly 73.5% appeared exactly once before disappearing. A separate analysis observed that pages updated within the last two months earned about 28% more AI citations than older pages on the same topic, and that ChatGPT in particular tends to cite very recently updated sources.
Page-level decay frameworks (rewrite when traffic drops 20-30% over a quarter) miss this entirely. By the time a page-level signal triggers, the citation has already been gone for weeks. The unit of measurement has to shift to the citation.
Framework overview
The framework has four components:
- Cohort & survival: bin every observed citation into a weekly cohort and fit a survival curve.
- Per-citation tracking: instrument each cited URL with stable IDs, age, freshness, and source-distribution attributes.
- Refresh-trigger thresholds: define named decay-curve positions that trigger action.
- Decision tree: for each triggered citation, choose refresh, revive, or retire.
1. Cohort definition and survival curve
A citation cohort is the set of distinct citation URLs first observed in a given calendar week, segmented by:
- Engine (ChatGPT, Perplexity, Claude, Google AI Overviews, Gemini)
- Query class (branded, non-branded, competitor-branded, ambiguous)
- Intent bucket (informational, navigational, commercial, transactional)
- Source type (owned canonical, owned syndication, third-party, reference work, community thread)
For each cohort, run weekly counts of how many cohort members are still cited in subsequent weeks. The empirical half-life is the number of weeks until the cohort drops to 50% of its initial size. Industry baselines suggest 4-5 weeks for the average source, but treat that as a starting prior, not a target. Recompute per cohort.
Keep three named points on every fitted curve:
- t₅₀ — the half-life (50% surviving). Trigger a freshness review.
- t₇₅ — the 75th-percentile decay point (25% surviving). Trigger a refresh decision.
- t₉₀ — the 90th-percentile decay point (10% surviving). Trigger retire-or-revive.
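The three named points can be read straight off the weekly survival counts. A minimal sketch, assuming a simple step-function read of the empirical curve (no parametric fit); the function name, its signature, and the cohort numbers are illustrative:

```python
def decay_thresholds(weekly_survivors):
    """Given weekly counts of still-cited cohort members (index 0 = initial
    cohort size), return the first week at which survival falls to or below
    50%, 25%, and 10% of the initial size (t50, t75, t90).
    A threshold the data has not yet crossed comes back as None."""
    initial = weekly_survivors[0]
    out = {}
    for name, frac in (("t50", 0.50), ("t75", 0.25), ("t90", 0.10)):
        cutoff = initial * frac
        out[name] = next(
            (week for week, n in enumerate(weekly_survivors) if n <= cutoff),
            None,
        )
    return out

# Hypothetical cohort: 400 citations first observed in one week,
# re-counted in each subsequent week.
survivors = [400, 340, 270, 215, 180, 140, 95, 70, 48, 33]
print(decay_thresholds(survivors))  # → {'t50': 4, 't75': 6, 't90': 9}
```

For small cohorts a proper estimator (e.g. Kaplan-Meier) with confidence bands is safer than this step-function read, but the named thresholds bind the same way either way.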
2. Per-citation tracking schema
For each citation, store at minimum:
- citation_id (stable hash of normalized URL + first-observed date)
- url, engine, cohort_week, first_seen, last_seen
- query_class, intent, source_type, cohort_id
- refresh_state (fresh | aging | decayed | revived | retired)
- last_publish_date, last_update_date, last_refresh_date
- distribution (count of distinct domains carrying syndicated/cross-posted versions)
- linked_mentions_30d, unlinked_mentions_30d
The distribution field is load-bearing. Industry research has observed that distributed content (syndication, cross-posting on partner publications, mirrored repositories) persists in LLM responses roughly twice as long as single-source pages.
3. Refresh-trigger thresholds
Thresholds bind to decay-curve position rather than absolute calendar age. This keeps triggers stable as the underlying engines change:
- At t₅₀ (half-life): queue a freshness review. Light-touch: confirm dates, refresh data points, validate links, refresh the AI summary blockquote.
- At t₇₅: queue a substantive refresh. Add a new section, re-cite primary sources, rewrite the AI summary, refresh examples, increment version.
- At t₉₀: queue a revive-or-retire decision. Either invest heavily in a new angle or remove from the citation surface and consolidate authority into a sibling page.
Layer two performance triggers on top:
- A 20-30% drop in a single citation's volume over a rolling three-week window forces an immediate review, regardless of curve position.
- A volatility spike (see GEO citation volatility tracking) escalates the next scheduled action by one tier.
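Combining curve position with the two performance overrides can be sketched as a small routing function. The function name, tier labels, and the 20% drop cutoff are illustrative assumptions layered on the thresholds above:

```python
def next_action(age_weeks, t50, t75, t90,
                volume_drop_3w=0.0, volatility_spike=False):
    """Map a citation's decay-curve position to a queued action.
    A >=20% three-week volume drop forces immediate review regardless of
    position; a volatility spike escalates the scheduled action one tier."""
    tiers = ["none", "freshness_review", "substantive_refresh", "revive_or_retire"]
    if volume_drop_3w >= 0.20:
        return "immediate_review"
    if age_weeks >= t90:
        tier = 3
    elif age_weeks >= t75:
        tier = 2
    elif age_weeks >= t50:
        tier = 1
    else:
        tier = 0
    if volatility_spike:
        tier = min(tier + 1, 3)   # escalate one tier, capped at revive-or-retire
    return tiers[tier]
```

Because the inputs are curve positions rather than calendar ages, the same function keeps working when a cohort's fitted half-life shifts.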
4. Refresh vs. revive vs. retire decision tree
For each triggered citation, route through:
- Refresh if: query intent is unchanged, the page still answers it, citation volume is non-zero, and a freshness pass plausibly restores citations. Default action at t₅₀ and t₇₅.
- Revive if: query intent has shifted, a competitor has published a meaningfully better answer, OR the topic has bifurcated. A revive is closer to a rewrite—new H1 candidate, new entities, new examples, optionally a new canonical_concept_id with a 301 from the old URL.
- Retire if: citation volume is zero across a full t₉₀ window, the query is consolidating into a sibling page, or the source-type signal indicates the page never had structural fit (e.g. a thin landing page that briefly absorbed citations during a news spike). Retiring means removing from llms.txt, consolidating links into a stronger sibling, and 301-ing if user-traffic warrants.
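The decision tree reduces to a short routing function over reviewer-supplied judgments. A sketch only; the flag names are hypothetical labels for the conditions listed above, and check order (retire, then revive, then refresh) mirrors the tree:

```python
def route_citation(flags: dict) -> str:
    """Decide refresh / revive / retire from reviewer-supplied boolean flags.
    Retire conditions are checked first, then revive, with refresh as the
    default action at t50 and t75."""
    if (flags.get("zero_volume_full_t90")
            or flags.get("query_consolidating_into_sibling")
            or flags.get("never_had_structural_fit")):
        return "retire"
    if (flags.get("intent_shifted")
            or flags.get("competitor_better_answer")
            or flags.get("topic_bifurcated")):
        return "revive"
    return "refresh"
```

Keeping the flags explicit (rather than inferring them from metrics) preserves the human review step the framework assumes at t₉₀.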
Editorial calendar integration
Wire the framework's queue into the editorial calendar:
- t₅₀ freshness reviews: capacity-planned, not strictly scheduled. Aim for a sustainable weekly batch.
- t₇₅ substantive refreshes: scheduled like a release. Each gets a sized ticket with a writer.
- t₉₀ revive/retire decisions: reviewed weekly by the GEO program owner. Default to retire unless a clear rewrite case exists—a sprawling under-maintained surface decays the rest of the program.
Common implementation mistakes
- Treating page age as a substitute for citation age. They diverge sharply.
- Refreshing on a flat calendar ("every six months") rather than on decay-curve position.
- Ignoring distribution. A single-source page is half as persistent as the same content syndicated to two reputable mirrors.
- Conflating volatility (noise) with decay (trend). Volatility may resolve without action; decay does not.
- Refreshing without incrementing version. If the visible metadata never changes, AI engines read the page as stale even after a substantive content update.
- Allowing retired citations to remain in llms.txt and sitemap.xml. They keep getting fetched and earning low confidence scores.
FAQ
Q: Should we use the published 4-5 week average as our half-life baseline?
Use it as a prior only. Refit per cohort. Half-life varies materially by engine (Perplexity, ChatGPT, and Google AI Overviews have all reported different decay curves) and by source type. The framework's value comes from your fitted curves, not from a global average.
Q: How does this differ from page-level content-decay frameworks?
Page-level frameworks act when traffic drops. This framework acts when citation survival drops, which leads the traffic signal by weeks. The two are complementary; this one is faster and AI-specific.
Q: What if a citation has zero distribution but is still cited?
Keep it. Distribution lengthens persistence on average; a single-source citation with high authority can still survive. Use the framework's per-citation trigger, not the distribution attribute alone.
Q: How small can a cohort be before survival curves become unreliable?
For stable curve fits, aim for at least ~200 distinct citations per cohort. Below that, fall back to a coarser cohort (engine × query class only) and accept wider confidence bands.
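The fallback can be encoded directly in how cohort keys are built. A minimal sketch; the function name, the ~200-citation cutoff from the answer above, and the tuple-key representation are all illustrative:

```python
def cohort_key(engine, query_class, intent, source_type,
               n_citations, min_n=200):
    """Use the full 4-way segmentation when the cohort is large enough for a
    stable survival fit; otherwise fall back to engine x query class and
    accept wider confidence bands."""
    if n_citations >= min_n:
        return (engine, query_class, intent, source_type)
    return (engine, query_class)
```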
Q: Does retiring a URL hurt overall GEO performance?
No, when done well. Retiring a decayed citation that has stopped earning recommendations and consolidating its content into a stronger sibling typically improves the sibling's citation rate. The hazard is retiring a still-citable URL on calendar age alone.
Related Articles
AI Platform Citation Mix Strategy
Portfolio framework for AI platform citation mix: allocate GEO effort across ChatGPT, Perplexity, Gemini, Claude, and Copilot by source bias.
AI Search Internal Linking Strategy
Internal linking patterns that help AI crawlers map entity relationships, propagate authority, and lift citation rates across your knowledge base.
AI search ranking signals: what likely matters (and how to test)
What likely matters for AI search ranking in 2026 — retrieval, authority, freshness, and structure — plus a reproducible way to test each signal instead of guessing.