On Finite Memory Universal Data Compression and Classification of Individual Sequences

Jacob Ziv

arXiv:cs/0612019 · cs.IT · January 25, 2013

TL;DR

This paper explores the limits of finite-memory universal data compression and classification for individual sequences, demonstrating that context tree coding and a specific classifier are essentially optimal for these tasks.

Contribution

It shows that context tree coding nearly achieves the best possible universal compression for finite blocks, and it presents a universal context classifier with linear storage complexity that is essentially optimal.

Findings

01 Context tree coding nearly achieves optimal universal compression.

02 A universal context classifier with linear storage is essentially optimal.

03 The results support the theoretical basis of the probabilistic suffix tree (PST) approach in learning and biology.

Abstract

Consider the case where consecutive blocks of N letters of a semi-infinite individual sequence X over a finite alphabet are being compressed into binary sequences by some one-to-one mapping. No a-priori information about X is available at the encoder, which must therefore adopt a universal data-compression algorithm. It is known that if the universal LZ77 data compression algorithm is successively applied to N-blocks, then the best error-free compression for the particular individual sequence X is achieved as N tends to infinity. The best possible compression that may be achieved by any universal data compression algorithm for finite N-blocks is discussed. It is demonstrated that context tree coding essentially achieves it. Next, consider a device called a classifier (or discriminator) that observes an individual training sequence X. The classifier's task is to examine individual test…
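
To make the block-wise setting concrete, the following is a minimal sketch of coding consecutive N-blocks with an adaptive context model. It is not the paper's construction: it assumes a fixed context order rather than a variable-depth context tree, uses an add-one (Laplace) probability estimate, and reports the ideal code length in bits instead of producing an actual bitstream. The names `context_code_length`, `blockwise_rate`, `order`, and `alphabet_size` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def context_code_length(block, order, alphabet_size):
    """Ideal adaptive code length (bits) of one block under a fixed-order context model.

    Assumption: add-one (Laplace) estimates per context. The model starts empty
    for every block, so nothing is remembered from earlier blocks.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen so far
    bits = 0.0
    for i, symbol in enumerate(block):
        # The first few symbols of a block only have a shortened context.
        context = block[max(0, i - order):i]
        p = (counts[context][symbol] + 1) / (totals[context] + alphabet_size)
        bits += -math.log2(p)
        counts[context][symbol] += 1
        totals[context] += 1
    return bits


def blockwise_rate(sequence, block_len, order, alphabet_size):
    """Average bits per symbol when each full N-block is coded independently."""
    blocks = [sequence[i:i + block_len]
              for i in range(0, len(sequence) - block_len + 1, block_len)]
    if not blocks:
        raise ValueError("sequence is shorter than one block")
    total = sum(context_code_length(b, order, alphabet_size) for b in blocks)
    return total / (len(blocks) * block_len)
```

For example, `blockwise_rate("01" * 100, 100, 1, 2)` codes two 100-symbol blocks of an alternating binary sequence with order-1 contexts. Because each block is coded from scratch, the per-symbol rate for a fixed N stays above what an encoder with unbounded memory could reach, which is the kind of finite-block penalty the abstract refers to.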

Taxonomy

Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Machine Learning and Algorithms