On Empirical Entropy
Paul M.B. Vitányi (CWI, University of Amsterdam)

TL;DR
This paper introduces a compression-based notion of empirical entropy for finite strings over a finite alphabet, and tests it by comparing the Normalized Information Distance with a related measure based on Shannon mutual information.
Contribution
It defines empirical entropy as the description length of the random variable involved plus the entropy it induces, and uses this notion to compare the Normalized Information Distance with a mutual-information-based measure.
Findings
The new measure counts the description of the random variable in addition to the entropy it induces, in contrast to the bare entropy of a (possibly higher-order) Markov process.
Comparing the Normalized Information Distance with a measure based on mutual information exposes where the two concepts agree and where they differ.
The only assumption made is that the distribution involved is computable.
Abstract
We propose a compression-based version of the empirical entropy of a finite string over a finite alphabet. Whereas previously one considers the naked entropy of (possibly higher order) Markov processes, we consider the sum of the description of the random variable involved plus the entropy it induces. We assume only that the distribution involved is computable. To test the new notion we compare the Normalized Information Distance (the similarity metric) with a related measure based on Mutual Information in Shannon's framework. This way the similarities and differences of the last two concepts are exposed.
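The quantities named in the abstract can be made concrete with a small sketch. It is not the paper's construction: the function names are hypothetical, the model cost in `two_part_length` is a crude assumption (one count per alphabet symbol), and `zlib` stands in for an ideal compressor. The `ncd` function is the Normalized Compression Distance, the usual practical approximation of the uncomputable Normalized Information Distance.

```python
import math
import zlib
from collections import Counter


def empirical_entropy(s: str) -> float:
    """Zeroth-order empirical entropy of s, in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    counts = Counter(s)
    # H_0(s) = sum over symbols a of (n_a / n) * log2(n / n_a)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def two_part_length(s: str, alphabet_size: int) -> float:
    """Illustrative two-part code length in bits: model cost plus data cost.

    Model cost (an assumption of this sketch, not the paper's definition):
    one count in 0..n per alphabet symbol, each written with
    ceil(log2(n + 1)) bits.
    Data cost: n times the zeroth-order empirical entropy.
    """
    n = len(s)
    model_bits = alphabet_size * math.ceil(math.log2(n + 1))
    data_bits = n * empirical_entropy(s)
    return model_bits + data_bits


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, with zlib as the compressor."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    print(empirical_entropy("aabb"))   # two equally frequent symbols
    print(two_part_length("aabb", 2))  # model bits + data bits
    a = b"abracadabra" * 50
    b = bytes(range(256)) * 3
    print(ncd(a, a), ncd(a, b))        # similar pair vs. dissimilar pair
```

The two-part length shows why the description of the distribution matters for short strings: the model cost can exceed the entropy term, which the bare empirical entropy ignores. Real compressors only approximate Kolmogorov complexity, so `ncd` values are indicative rather than exact.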
Taxonomy
Topics: Computability, Logic, AI Algorithms · Fractal and DNA sequence analysis · Algorithms and Data Compression
