Distribution Compression in Near-linear Time
Abhishek Shetty, Raaz Dwivedi, Lester Mackey

TL;DR
This paper introduces Compress++, a meta-procedure that accelerates distribution compression algorithms to near-linear time while maintaining accuracy, enabling efficient summarization of probability distributions in high-dimensional settings.
Contribution
Compress++ is a meta-algorithm that speeds up existing thinning algorithms for distribution compression, achieving near-linear runtime for quadratic-time inputs while increasing error by at most a factor of 4.
Findings
Compress++ reduces the runtime of quadratic-time thinning algorithms to near-linear in the sample size.
It increases the error of the input algorithm by at most a factor of 4.
Benchmarks show it matches or nearly matches the accuracy of its input algorithm in much less time.
Abstract
In distribution compression, one aims to accurately summarize a probability distribution $\mathbb{P}$ using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling $n$ points from a Markov chain and identifying $\sqrt{n}$ points with $\widetilde{\mathcal{O}}(1/\sqrt{n})$ discrepancy to $\mathbb{P}$. Unfortunately, these algorithms suffer from quadratic or super-quadratic runtime in the sample size $n$. To address this deficiency, we introduce Compress++, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor of $4$ in error. When combined with the quadratic-time kernel halving and kernel thinning algorithms of Dwivedi and Mackey (2021), Compress++ delivers $\sqrt{n}$ points with $\mathcal{O}(\sqrt{\log n/n})$ integration error and better-than-Monte-Carlo maximum mean discrepancy in $\mathcal{O}(n…
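The meta-procedure described in the abstract can be illustrated with a rough sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the recursion (split into four parts, compress each, concatenate, halve), the oversampling parameter `g`, the function names `compress` and `compress_pp`, and above all the `halve` placeholder, which simply keeps every other point and is not kernel halving. The sketch also assumes the input size is a power of 4.

```python
import numpy as np

def halve(points):
    """Hypothetical stand-in halving rule: keep every other point.
    This is NOT kernel halving; any routine returning half its input could be swapped in."""
    return points[::2]

def compress(points, g, halve_fn=halve):
    """Assumed recursion: shrink n = 4**k points to 2**g * sqrt(n) points."""
    n = len(points)
    if n <= 4 ** g:
        return points  # base case: input is already small enough
    quarter = n // 4
    # Recursively compress each of the four consecutive quarters.
    parts = [
        compress(points[i * quarter:(i + 1) * quarter], g, halve_fn)
        for i in range(4)
    ]
    # The concatenation has 2 * 2**g * sqrt(n) points; halving leaves 2**g * sqrt(n).
    return halve_fn(np.concatenate(parts))

def compress_pp(points, g, halve_fn=halve):
    """Assumed two-stage procedure: compress, then thin down to sqrt(n) points."""
    coreset = compress(points, g, halve_fn)
    for _ in range(g):  # g halving rounds remove the 2**g oversampling factor
        coreset = halve_fn(coreset)
    return coreset

rng = np.random.default_rng(0)
sample = rng.normal(size=(256, 2))  # n = 256 points in two dimensions
summary = compress_pp(sample, g=1)
```

With `n = 256` and `g = 1`, the intermediate coreset has 32 points and the final summary has 16, that is, $\sqrt{n}$. In this sketch the halving routine is only ever called on inputs of size proportional to $\sqrt{n}$ at each recursion level, which is the intuition for why a quadratic-time halving rule could yield near-linear total runtime.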
Taxonomy
Topics: Mathematical Approximation and Integration · Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis
