List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression
Joseph Rowan, Buu Phan, Ashish Khisti

TL;DR
This paper introduces a list-level distribution coupling method that extends Gumbel-max sampling, proves a lower bound on its acceptance probability (the list matching lemma), and applies it to multi-draft speculative decoding and lossy compression, with competitive experimental results.
Contribution
The paper proposes a new sampling technique for list-level distribution coupling, establishes a lower bound on its acceptance probability, and applies both to multi-draft speculative decoding and distributed lossy compression.
Findings
The multi-draft speculative sampling mechanism is simple to implement and performs competitively with baselines such as SpecTr and SpecInfer across a range of language tasks.
Provides a theoretical lower bound on acceptance probability.
Demonstrates significant gains in lossy compression experiments.
Abstract
We study a relaxation of the problem of coupling probability distributions -- a list of samples is generated from one distribution and an accept is declared if any one of these samples is identical to the sample generated from the other distribution. We propose a novel method for generating samples, which extends the Gumbel-max sampling suggested in Daliri et al. (arXiv:2408.07978) for coupling probability distributions. We also establish a corresponding lower bound on the acceptance probability, which we call the list matching lemma. We next discuss two applications of our setup. First, we develop a new mechanism for multi-draft speculative sampling that is simple to implement and achieves performance competitive with baselines such as SpecTr and SpecInfer across a range of language tasks. Our method also guarantees a certain degree of drafter invariance with respect to the output…
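The list-level setup described in the abstract can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses the exponential-race form of Gumbel-max sampling (argmin of Exp(1) noise divided by probability), draws K draft samples from p using K independent rows of shared noise, and draws the single target sample from q using the elementwise minimum of those rows. The function names (`exp_race_sample`, `list_coupled_draw`, `acceptance_rate`) and the example distributions are hypothetical and chosen only for this example.

```python
import random


def exp_race_sample(probs, noise):
    """Return argmin_i noise[i] / probs[i].

    With i.i.d. Exp(1) noise this is a draw from `probs`
    (equivalent to Gumbel-max sampling).
    """
    best, best_val = None, float("inf")
    for i, (p, e) in enumerate(zip(probs, noise)):
        if p > 0 and e / p < best_val:
            best, best_val = i, e / p
    return best


def list_coupled_draw(p, q, k, rng):
    """Draw k samples from p and one sample from q using shared noise."""
    # One row of shared Exp(1) noise per draft sample.
    noise = [[rng.expovariate(1.0) for _ in p] for _ in range(k)]
    drafts = [exp_race_sample(p, row) for row in noise]
    # The elementwise minimum of k Exp(1) variables is Exp(k). The argmin
    # of the race does not depend on a common scale, so target ~ q.
    pooled = [min(row[i] for row in noise) for i in range(len(q))]
    target = exp_race_sample(q, pooled)
    return drafts, target


def acceptance_rate(p, q, k, trials=20000, seed=0):
    """Monte Carlo estimate of P(target is among the k drafts)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        drafts, target = list_coupled_draw(p, q, k, rng)
        hits += target in drafts
    return hits / trials


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # hypothetical draft distribution
    q = [0.2, 0.3, 0.5]  # hypothetical target distribution
    for k in (1, 2, 4, 8):
        print(k, acceptance_rate(p, q, k))
```

With K = 1 this reduces to the ordinary Gumbel-max coupling of two distributions. Increasing K should raise the estimated acceptance rate while the target sample keeps its marginal distribution q, which is the kind of behaviour the list matching lemma bounds from below.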
Taxonomy
Topics: Advanced Data Compression Techniques · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
