Algorithmic statistics: forty years later
Nikolai Vereshchagin, Alexander Shen

TL;DR
This paper reviews the development of algorithmic statistics over forty years, covering its philosophical and technical motivations, its key results, and its relationship to Kolmogorov complexity.
Contribution
It provides a comprehensive exposition of the main results in algorithmic statistics, including proofs and historical context, connecting philosophical and information-theoretic perspectives.
Findings
Multiple definitions converge to similar curves characterizing data behavior
Existence of non-stochastic data with no good models
Deep connection between statistical models and Kolmogorov complexity
Abstract
Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. Once this notion of a "good model" is introduced, a natural question arises: is it possible that for some piece of data there is no good model? If so, how often do such bad ("non-stochastic") data appear "in real life"? Another, more technical motivation comes from algorithmic information theory. This theory introduces a notion of complexity of a finite object (the amount of information in the object); it assigns to every object a number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistics provides a more fine-grained classification: for each finite object, a curve is defined that characterizes its behavior. It turns out that…
Taxonomy
Topics: Computability, Logic, AI Algorithms · Benford’s Law and Fraud Detection · Statistical Mechanics and Entropy
