Coupling without Communication and Drafter-Invariant Speculative Decoding

Majid Daliri; Christopher Musco; Ananda Theertha Suresh

arXiv:2408.07978·cs.DS·August 21, 2025


TL;DR

This paper studies communication-free coupling of distributions, introduces Gumbel sampling as an improvement over Weighted MinHash, and applies these methods to speculative decoding in language models, where Gumbel sampling achieves higher success probabilities than Weighted MinHash.

Contribution

The paper provides a simpler proof of optimality for communication-free coupling and introduces Gumbel sampling as a Pareto improvement over Weighted MinHash, with practical applications in language model decoding.

Findings

1. Gumbel sampling achieves higher success probability than Weighted MinHash.

2. Communication-free protocols can be used for fixed-output speculative decoding.

3. Gumbel sampling outperforms Weighted MinHash in language generation experiments.
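To make the Gumbel sampling idea concrete, here is a minimal sketch assuming the standard Gumbel-max formulation with shared noise: both parties read the same uniform draws `u_i` and each returns the index minimizing `-ln(u_i) / p_i` under their own distribution. The function names `gumbel_sample` and `match_rate`, and the use of NumPy, are illustrative assumptions and are not taken from the paper or its repository.

```python
import numpy as np

def gumbel_sample(probs, shared_uniforms):
    """Return an index distributed according to `probs`.

    Assumption: this is the textbook Gumbel-max / exponential-race trick.
    -ln(u_i) / p_i is exponential with rate p_i, so the argmin is index i
    with probability p_i. Zero-probability items get score +inf.
    """
    probs = np.asarray(probs, dtype=float)
    with np.errstate(divide="ignore"):
        scores = -np.log(shared_uniforms) / probs
    return int(np.argmin(scores))

def match_rate(p, q, trials=20000, seed=0):
    """Estimate Pr[a = b] when Alice (p) and Bob (q) share the uniforms.

    No message is exchanged: the only shared object is the public
    randomness `u`, drawn fresh for every trial.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        u = rng.random(len(p))          # public randomness
        a = gumbel_sample(p, u)         # Alice's sample, a ~ P
        b = gumbel_sample(q, u)         # Bob's sample, b ~ Q
        hits += a == b
    return hits / trials

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    print("estimated Pr[a = b]:", match_rate(p, q))
```

Because each party only ever looks at its own distribution and the shared draws, the same code applies whether or not the other party's distribution is known, which is the property a drafter-invariant decoding scheme would rely on.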

Abstract

Suppose Alice has a distribution $P$ and Bob has a distribution $Q$. Alice wants to draw a sample $a \sim P$ and Bob a sample $b \sim Q$ such that $a = b$ with as high a probability as possible. It is well known that, by sampling from an optimal coupling between the distributions, Alice and Bob can achieve $\Pr[a = b] = 1 - D_{TV}(P, Q)$, where $D_{TV}(P, Q)$ is the total variation distance between $P$ and $Q$. What if Alice and Bob must solve this same problem without communicating at all? Perhaps surprisingly, with access to public randomness, they can still achieve $\Pr[a = b] \geq \frac{1 - D_{TV}(P, Q)}{1 + D_{TV}(P, Q)} \geq 1 - 2 D_{TV}(P, Q)$ using a simple protocol based on the Weighted MinHash algorithm. This bound was shown to be optimal in the worst case by [Bavarian et al., 2020]. In this work, we revisit the communication-free coupling problem. We provide a simpler proof of…
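The three quantities in the abstract can be compared numerically on a small example. The sketch below assumes finite distributions given as probability vectors; the helper names `tv_distance` and `coupling_bounds` are hypothetical and chosen only for illustration.

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance between two finite distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def coupling_bounds(p, q):
    """Evaluate the agreement probabilities quoted in the abstract.

    optimal_coupling    : 1 - D_TV            (communication allowed)
    communication_free  : (1 - D_TV)/(1 + D_TV) (public randomness only)
    linear_lower_bound  : 1 - 2 * D_TV        (weaker, simpler bound)
    """
    d = tv_distance(p, q)
    return {
        "tv": d,
        "optimal_coupling": 1 - d,
        "communication_free": (1 - d) / (1 + d),
        "linear_lower_bound": 1 - 2 * d,
    }

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.3, 0.5]
    for name, value in coupling_bounds(p, q).items():
        print(f"{name}: {value:.3f}")
```

For this pair, $D_{TV}(P, Q) = 0.3$, so the three values are $0.7$, roughly $0.538$, and $0.4$. The gap between the last two equals $\frac{2 D_{TV}^2}{1 + D_{TV}}$, which is why the linear bound is only tight when the distributions are close.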


Code & Models

Repositories

majid-daliri/disd (PyTorch, official)


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Database Systems and Queries