Generalized Probability Smoothing

Christopher Mattern

arXiv:1712.02151·cs.IT·January 11, 2018

TL;DR

This paper gives a code length analysis of a generalized Probability Smoothing method for sequential prediction and bounds its redundancy with respect to Piecewise Stationary Sources over a finite alphabet.

Contribution

It introduces a generalized Probability Smoothing model and derives redundancy bounds expressed in terms of the total variation in probability mass of the source's stationary distributions.

Findings

01

Redundancy of $O(S \cdot \sqrt{T \log T})$ for sequences of length $T$

02

Analysis applies to finite alphabet sources

03

The redundancy bound scales linearly with the number of segments $S$

Abstract

In this work we consider a generalized version of Probability Smoothing, the core elementary model for sequential prediction in the state-of-the-art PAQ family of data compression algorithms. Our main contribution is a code length analysis that considers the redundancy of Probability Smoothing with respect to a Piecewise Stationary Source. The analysis holds for a finite alphabet and expresses redundancy in terms of the total variation in probability mass of the stationary distributions of a Piecewise Stationary Source. By choosing parameters appropriately, Probability Smoothing has redundancy $O(S \cdot \sqrt{T \log T})$ for sequences of length $T$ with respect to a Piecewise Stationary Source with $S$ segments.
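To make the notion of code length concrete, here is a minimal Python sketch of a fixed-rate smoothing predictor. It assumes the commonly described update in which the estimate moves toward the observed symbol, $p_{t+1}(x) = \alpha\, p_t(x) + (1-\alpha)\,[x = x_t]$; the generalized model analyzed in the paper may differ, for example by letting the smoothing rate vary over time. The function name `smoothing_code_length`, the parameter `alpha`, and the uniform initial estimate are illustrative assumptions, not taken from the paper.

```python
import math


def smoothing_code_length(sequence, alphabet_size, alpha=0.95):
    """Ideal code length in bits when coding `sequence` with a
    fixed-rate smoothing predictor (illustrative assumption, not
    the paper's exact model).

    sequence      -- iterable of symbols encoded as ints in [0, alphabet_size)
    alphabet_size -- number of distinct symbols
    alpha         -- smoothing rate, strictly between 0 and 1
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Assumed starting point: a uniform estimate over the alphabet.
    probs = [1.0 / alphabet_size] * alphabet_size
    bits = 0.0

    for symbol in sequence:
        # Ideal cost of coding the symbol under the current estimate.
        bits += -math.log2(probs[symbol])
        # Shrink every probability by alpha, then give the freed mass
        # (1 - alpha) to the symbol that was just observed.
        probs = [
            alpha * p + (1.0 - alpha) * (1.0 if i == symbol else 0.0)
            for i, p in enumerate(probs)
        ]

    return bits
```

Calling `smoothing_code_length([0] * 100 + [1] * 100, alphabet_size=2)` codes a binary sequence with two stationary segments. Under this sketch, the cost concentrates near the start and at the segment boundary, where the estimate has to re-adapt; this adaptation cost is the kind of overhead that a redundancy bound relative to a Piecewise Stationary Source accounts for.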
