A Machine Learning Perspective on Predictive Coding with PAQ
Byron Knoll, Nando de Freitas

TL;DR
This paper analyzes the PAQ8 lossless compression algorithm through a machine learning lens, offering insights into its modules, proposing improvements, and exploring its applications in various machine learning tasks.
Contribution
The report gives a detailed statistical description of PAQ8, suggests ways to improve it, and aims to transfer its ideas to other machine learning models such as recurrent neural networks.
Findings
Understanding of some PAQ8 modules from a statistical perspective
Proposed improvements to PAQ8 based on this understanding
Application of PAQ to diverse machine learning tasks such as language modeling and game playing
Abstract
PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a starting point for discussions that will increase our understanding, lead to improvements to PAQ8, and facilitate a transfer of knowledge from PAQ8 to other machine learning methods, such as recurrent neural networks and stochastic memoizers. Finally, the report presents a broad range of new applications of PAQ to machine learning tasks including language modeling and adaptive text prediction, adaptive…
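To make the link between prediction and compression concrete, the sketch below mixes bit-level predictions in the logistic domain and trains the mixing weights online, which is the general idea usually attributed to PAQ-family mixers. It is a simplified illustration under stated assumptions, not PAQ8's implementation: the two toy models, the names `LogisticMixer` and `ideal_code_length`, and the learning rate are all hypothetical, and a real compressor would combine many context models and feed the mixed probability to an arithmetic coder. Here the cost of each bit is approximated by its ideal code length, -log2 of the probability assigned to it.

```python
import math


def stretch(p):
    """Logit: map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def squash(x):
    """Logistic sigmoid, the inverse of stretch."""
    return 1.0 / (1.0 + math.exp(-x))


def clamp(p, eps=1e-6):
    """Keep probabilities away from 0 and 1 so logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


class LogisticMixer:
    """Single-layer mixer: p = squash(sum_i w_i * stretch(p_i))."""

    def __init__(self, n_models, learning_rate=0.1):
        self.weights = [0.0] * n_models
        self.learning_rate = learning_rate  # assumed value, not from the paper
        self.inputs = [0.0] * n_models

    def predict(self, probs):
        # Remember the stretched inputs; the update step needs them.
        self.inputs = [stretch(clamp(p)) for p in probs]
        dot = sum(w * x for w, x in zip(self.weights, self.inputs))
        return clamp(squash(dot))

    def update(self, p_mix, bit):
        # Gradient step on the coding cost (log loss) of the observed bit.
        err = bit - p_mix
        self.weights = [
            w + self.learning_rate * err * x
            for w, x in zip(self.weights, self.inputs)
        ]


def ideal_code_length(bits, learning_rate=0.1):
    """Total ideal code length, in bits, of a 0/1 sequence under the mixer."""
    mixer = LogisticMixer(n_models=2, learning_rate=learning_rate)
    ones = 0
    total_bits = 0.0
    for n, bit in enumerate(bits):
        # Toy model 1: order-0 frequency estimate with add-1/2 smoothing.
        p_order0 = (ones + 0.5) / (n + 1.0)
        # Toy model 2: an uninformative constant prediction.
        p_mix = mixer.predict([p_order0, 0.5])
        total_bits += -math.log2(p_mix if bit else 1.0 - p_mix)
        mixer.update(p_mix, bit)
        ones += bit
    return total_bits


if __name__ == "__main__":
    data = ([1] * 9 + [0]) * 20  # a biased toy sequence
    print(f"{len(data)} input bits, about {ideal_code_length(data):.1f} coded bits")
```

Because the weights start at zero, the mixer initially assigns probability 0.5 to every bit and pays one bit per symbol; the cost drops only as the weights learn to trust the more informative model.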
Taxonomy
Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Numerical Methods and Algorithms
