The Probabilistic Foundations of Surveillance Failure: From False Alerts to Structural Bias
Marco Pollanen

TL;DR
This paper identifies fundamental probabilistic limits in high-dimensional surveillance systems, showing that growth in data scale and attribute correlations can make detection unreliable and produce structural biases.
Contribution
It introduces a probabilistic framework explaining why high-dimensional screening inherently produces false alerts and structural biases, independent of implementation flaws.
Findings
High-dimensional screening causes false alerts to become almost certain at scale.
Small increases in data or correlations sharply degrade system reliability.
Structural biases are mathematically inevitable beyond a critical data scale.
Abstract
For decades, forensic statisticians have debated whether searching large DNA databases undermines the evidential value of a match. Modern surveillance faces an exponentially harder problem: screening populations across thousands of attributes using threshold rules rather than exact matching. Intuition suggests that requiring many coincidental matches should make false alerts astronomically unlikely. This intuition fails. Consider a system that monitors 1,000 attributes, each with a 0.5 percent innocent match rate. Matching 15 pre-specified attributes has probability of about \(3 \times 10^{-35}\), roughly one in 30 decillion, effectively impossible. But operational systems require no such specificity. They might flag anyone who matches any 15 of the 1,000. In a city of one million innocent people, this produces about 226 false alerts. A seemingly impossible event becomes all but guaranteed. This is not…
