Algorithmic statistics revisited

Nikolay Vereshchagin; Alexander Shen

arXiv:1504.04950 · cs.IT · April 28, 2015 · 1 cites

Open Access

TL;DR

This paper revisits algorithmic statistics through the stochasticity profile of a data string: a curve, defined using notions from algorithmic information theory, that represents the trade-off between the complexity of a model and its adequacy.

Contribution

It surveys the four equivalent definitions of the stochasticity profile and the results that relate them to each other.

Findings

01. Stochasticity profiles can be characterized in four equivalent ways.

02. The survey relates the four underlying notions: randomness deficiency, minimal description length, position in the lists of simple strings, and Kolmogorov complexity with decompression time bounded by the busy beaver function.

03. The paper clarifies the theoretical foundations of model adequacy in algorithmic statistics.

Abstract

The mission of statistics is to provide adequate statistical hypotheses (models) for observed data. But what is an "adequate" model? To answer this question, one needs to use the notions of algorithmic information theory. It turns out that for every data string $x$ one can naturally define a "stochasticity profile", a curve that represents a trade-off between the complexity of a model and its adequacy. This curve has four different equivalent definitions in terms of (1) randomness deficiency, (2) minimal description length, (3) position in the lists of simple strings and (4) Kolmogorov complexity with decompression time bounded by the busy beaver function. We present a survey of the corresponding definitions and results relating them to each other.
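
As a rough orientation for definitions (1) and (2), the quantities involved can be sketched as follows. The abstract does not fix any notation, so the symbols here ($A$ for a finite set of strings used as a model, $C$ for Kolmogorov complexity, $d$, $\delta$, $\alpha$, $\beta$) are assumptions that follow conventions common in algorithmic statistics. The paper's own notation may differ, and statements of this kind usually hold only up to logarithmic terms.

```latex
% Assumed notation: A is a finite set of strings containing x (the model),
% C(.) is Kolmogorov complexity, C(. | .) its conditional version.
\[
  d(x \mid A) = \log_2 |A| - C(x \mid A)
  \qquad \text{(randomness deficiency of $x$ as an element of $A$)}
\]
\[
  \delta(x, A) = C(A) + \log_2 |A| - C(x)
  \qquad \text{(optimality deficiency of the two-part description)}
\]
% x is called (\alpha, \beta)-stochastic if some finite set A containing x
% satisfies C(A) \le \alpha and d(x \mid A) \le \beta.
```

Under these assumptions, a small $d(x \mid A)$ means that $x$ looks like a typical element of $A$, while $C(A)$ measures how complex the model is; the set of pairs $(\alpha, \beta)$ for which $x$ is $(\alpha, \beta)$-stochastic is one way to read the trade-off curve described above.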

Taxonomy

Topics: Computability, Logic, AI Algorithms · Algorithms and Data Compression · Benford’s Law and Fraud Detection