Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement
Ronald Sielinski

TL;DR
This paper highlights the stochastic nature of AI-generated responses and proposes a statistical framework to measure and report the uncertainty in domain visibility metrics within generative search platforms.
Contribution
It introduces a novel approach to quantify and incorporate uncertainty in citation visibility metrics, emphasizing the importance of confidence intervals and variability analysis.
Findings
Citation distributions follow a power-law pattern.
Significant variability exists in citation metrics across repeated samples.
Rankings are unstable and sensitive to sampling noise.
Abstract
AI-powered answer engines are inherently non-deterministic: identical queries submitted at different times can produce different responses and cite different sources. Despite this stochastic behavior, current approaches to measuring domain visibility in generative search typically rely on single-run point estimates of citation share and prevalence, implicitly treating them as fixed values. This paper argues that citation visibility metrics should be treated as sample estimators of an underlying response distribution rather than fixed values. We conduct an empirical study of citation variability across three generative search platforms--Perplexity Search, OpenAI SearchGPT, and Google Gemini--using repeated sampling across three consumer product topics. Two sampling regimes are employed: daily collections over nine days and high-frequency sampling at ten-minute intervals. We show that…
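To make the idea of treating citation share as a sample estimator concrete, the following is a minimal sketch of one common way to attach an interval to such a metric: a percentile bootstrap over repeated runs. It is an illustration under stated assumptions, not the paper's method. The function names (`citation_share`, `bootstrap_ci`), the domain names, and the toy data are all hypothetical, and each "run" is assumed to be the list of domains cited in one response to the same query.

```python
import random

def citation_share(runs, domain):
    """Fraction of all citations, pooled across runs, that point to `domain`."""
    total = sum(len(r) for r in runs)
    hits = sum(r.count(domain) for r in runs)
    return hits / total if total else 0.0

def bootstrap_ci(runs, domain, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for citation share.

    Resamples whole runs with replacement, so the run (one response)
    is treated as the sampling unit. This is an assumption of the sketch.
    """
    rng = random.Random(seed)
    n = len(runs)
    stats = sorted(
        citation_share([runs[rng.randrange(n)] for _ in range(n)], domain)
        for _ in range(n_boot)
    )
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

# Hypothetical data: domains cited in four repeated runs of one query.
runs = [
    ["a.com", "b.com", "a.com"],
    ["a.com", "c.com"],
    ["b.com", "c.com", "d.com"],
    ["a.com", "b.com"],
]

point = citation_share(runs, "a.com")   # single pooled point estimate
lo, hi = bootstrap_ci(runs, "a.com")    # interval around that estimate
print(f"share={point:.2f}, 95% interval=({lo:.2f}, {hi:.2f})")
```

With only a handful of runs the interval is wide, which is the practical point: a single-run share reported without an interval hides how much the value can move between samples.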
Taxonomy
Topics: Information Retrieval and Search Behavior · Expert finding and Q&A systems · Ethics and Social Impacts of AI
