A Lower Bound on the Complexity of Approximating the Entropy of a Markov Source

Travis Gagie

arXiv:0912.5079 · cs.IT · December 31, 2009

TL;DR

This paper establishes a fundamental lower bound on the number of samples needed to approximate the entropy of a Markov source, showing that certain entropy levels are inherently hard to distinguish with limited data.

Contribution

It proves a lower bound on the number of samples needed to approximate the entropy of a $k$th-order Markov source, even when the entropy is promised to take one of two well-separated values.

Findings

01

No algorithm can guess the entropy correctly with probability bounded away from $1/2$ after sampling at most $(\sigma - k)^{k/2 - \epsilon}$ characters.

02

The lower bound applies even when the entropy is known to be either $0$ or at least $\log(\sigma - k)$.

03

The sample budget in the bound, $(\sigma - k)^{k/2 - \epsilon}$, grows exponentially with the order $k$ of the Markov source.
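
To get a feel for how quickly the quantity in these findings grows, the following sketch simply evaluates $(\sigma - k)^{k/2 - \epsilon}$ for a few parameter choices. The function name `sample_lower_bound` and the example values of $\sigma$, $k$ and $\epsilon$ are illustrative assumptions, not taken from the paper. The result itself is stated only for sufficiently large $\sigma$, so the numbers are arithmetic on the formula rather than guarantees for these specific parameters.

```python
def sample_lower_bound(sigma: int, k: int, eps: float) -> float:
    """Evaluate (sigma - k) ** (k / 2 - eps), the sample budget in the bound.

    Hypothetical helper for illustration only; it does not check whether
    sigma is "sufficiently large" in the sense the result requires.
    """
    if k < 1 or eps <= 0 or sigma <= k:
        raise ValueError("need k >= 1, eps > 0 and sigma > k")
    return float(sigma - k) ** (k / 2 - eps)


if __name__ == "__main__":
    # Hold sigma - k fixed at 256 and vary the order k (assumed values).
    for k in (2, 4, 8):
        print(k, sample_lower_bound(sigma=256 + k, k=k, eps=0.1))
```

With $\sigma - k$ held fixed, each increase of $k$ by 2 multiplies the value by a factor of $\sigma - k$, which is the exponential dependence on the order.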

Abstract

Suppose that, for any $k \geq 1$, $\epsilon > 0$ and sufficiently large $\sigma$, we are given a black box that allows us to sample characters from a $k$th-order Markov source over the alphabet $\{0, \ldots, \sigma - 1\}$. Even if we know the source has entropy either $0$ or at least $\log(\sigma - k)$, there is still no algorithm that, with probability bounded away from $1/2$, guesses the entropy correctly after sampling at most $(\sigma - k)^{k/2 - \epsilon}$ characters.
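
The black-box model in the abstract can be pictured with a small sketch: an object that hides a rule mapping the last $k$ characters to the next one and counts how many characters have been drawn. Everything here (the class `MarkovSource`, the helpers `deterministic_rule` and `uniform_rule`, and the two toy sources) is a hypothetical illustration of the setting, and is not the construction used in the paper's proof.

```python
import random
from typing import Callable, Tuple

Context = Tuple[int, ...]
Rule = Callable[[Context, random.Random], int]


class MarkovSource:
    """Toy black box emitting characters from a k-th order Markov source
    over the alphabet {0, ..., sigma - 1}. Illustrative only."""

    def __init__(self, sigma: int, k: int, rule: Rule, seed: int = 0) -> None:
        self.sigma = sigma
        self.k = k
        self._rule = rule  # maps (last k characters, rng) to the next character
        self._rng = random.Random(seed)
        # Assumed start state: a uniformly random context of length k.
        self._context: Context = tuple(
            self._rng.randrange(sigma) for _ in range(k)
        )
        self.samples_drawn = 0  # the budget an algorithm is charged for

    def sample(self) -> int:
        c = self._rule(self._context, self._rng)
        self._context = self._context[1:] + (c,)
        self.samples_drawn += 1
        return c


def deterministic_rule(sigma: int) -> Rule:
    """Next character is a fixed function of the context: entropy rate 0."""
    return lambda context, rng: (sum(context) + 1) % sigma


def uniform_rule(sigma: int) -> Rule:
    """Next character is uniform over the alphabet: entropy rate log(sigma)."""
    return lambda context, rng: rng.randrange(sigma)


if __name__ == "__main__":
    src = MarkovSource(sigma=10, k=3, rule=uniform_rule(10), seed=42)
    print([src.sample() for _ in range(8)], src.samples_drawn)
```

An algorithm in this model only ever calls `sample()`, and the result bounds how large `samples_drawn` must become before the two entropy cases can be told apart. These two toy rules are easy to distinguish; they show the interface, not a hard instance.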

Taxonomy

Topics: Algorithms and Data Compression · semigroups and automata theory · DNA and Biological Computing