On Probability Estimation via Relative Frequencies and Discount
Christopher Mattern

TL;DR
This paper analyzes a probability estimation algorithm based on relative frequencies and frequency discount. It shows that the code length the algorithm requires above a piecewise stationary model is small, which gives a theoretical account of the empirically observed recency effect.
Contribution
It formulates Algorithm RFD, a typical probability estimator based on relative frequencies and frequency discount, and analyzes its code length relative to piecewise stationary models with bounded and unbounded letter probabilities.
Findings
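The code length Algorithm RFD requires above an arbitrary piecewise stationary model is small.
The analysis theoretically confirms the recency effect of periodic frequency discount, which had often been observed empirically.
The result helps explain why estimators of this kind, already attractive in memory use and running time, perform well in statistical data compression.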
Abstract
Probability estimation is an elementary building block of every statistical data compression algorithm. In practice, probability estimation is often based on relative letter frequencies, which are scaled down when their sum becomes too large. Such algorithms are attractive in terms of memory requirements, running time, and practical performance. However, there is still a lack of theoretical understanding. In this work we formulate a typical probability estimation algorithm based on relative frequencies and frequency discount, Algorithm RFD. Our main contribution is its theoretical analysis. We show that the code length it requires above an arbitrary piecewise stationary model with bounded and unbounded letter probabilities is small. This theoretically confirms the recency effect of periodic frequency discount, which has often been observed empirically.
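The mechanism the abstract describes (relative letter frequencies that are scaled down when their sum grows too large) can be illustrated with a minimal sketch. This is not the paper's exact Algorithm RFD: the class name, the initial count `init`, the threshold `limit`, and the factor `discount` are all assumptions chosen for illustration.

```python
import math


class FrequencyDiscountEstimator:
    """Illustrative relative-frequency estimator with periodic discount.

    All parameter names and defaults are hypothetical, not taken from the paper.
    """

    def __init__(self, alphabet_size, limit=32.0, init=1.0, discount=0.5):
        self.counts = [init] * alphabet_size  # one positive count per letter
        self.limit = limit                    # discount once the sum exceeds this
        self.discount = discount              # scaling factor applied to all counts

    def prob(self, symbol):
        # Estimated probability = relative frequency of the letter.
        return self.counts[symbol] / sum(self.counts)

    def update(self, symbol):
        # Count the observed letter, then scale all counts down if needed.
        self.counts[symbol] += 1.0
        if sum(self.counts) > self.limit:
            self.counts = [c * self.discount for c in self.counts]


def code_length_bits(sequence, estimator):
    """Ideal code length (in bits) when coding each letter with its estimate."""
    total = 0.0
    for s in sequence:
        total += -math.log2(estimator.prob(s))
        estimator.update(s)
    return total


# Toy piecewise stationary input: the letter distribution switches once.
sequence = [0] * 200 + [1] * 200
with_discount = code_length_bits(sequence, FrequencyDiscountEstimator(2, limit=32.0))
# A very large limit means the discount never triggers on this input.
without_discount = code_length_bits(sequence, FrequencyDiscountEstimator(2, limit=1e9))
print(with_discount, without_discount)
```

On this toy input the discounted estimator should need fewer bits, because old counts are forgotten and the estimate adapts quickly after the switch. That is the recency effect in miniature. Note that in this sketch the count of an unseen letter can decay toward zero, so a real implementation would likely need a safeguard that keeps probabilities bounded away from zero.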
Taxonomy
Topics: Algorithms and Data Compression · Advanced Wireless Communication Techniques · Error Correcting Code Techniques
