Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
Xuandong Zhao, Lei Li, Yu-Xiang Wang

TL;DR
The paper introduces Permute-and-Flip (PF), a decoding method for large language models with a provably better quality-stability tradeoff than standard sampling, together with a cryptographic watermarking scheme that leaves the sampling distribution unchanged.
Contribution
It presents a decoding algorithm whose quality-stability tradeoff is provably up to 2x better than sampling, and a compatible watermarking scheme, analogous to Aaronson (2023)'s Gumbel watermark, tailored to that decoder.
Findings
PF decoder outperforms naive sampling in perplexity
Watermarking scheme achieves low false positive rate
Decoding method maintains stability comparable to standard sampling
Abstract
In this paper, we propose a new decoding method called Permute-and-Flip (PF) decoder. It enjoys stability properties similar to the standard sampling decoder, but is provably up to 2x better in its quality-stability tradeoff than sampling and never worse than any other decoder. We also design a cryptographic watermarking scheme analogous to Aaronson (2023)'s Gumbel watermark, but naturally tailored for PF decoder. The watermarking scheme does not change the distribution to sample, while allowing arbitrarily low false positive rate and high recall whenever the generated text has high entropy. Our experiments show that the PF decoder (and its watermarked counterpart) significantly outperform(s) naive sampling (and its Gumbel watermarked counterpart) in terms of perplexity, while retaining the same stability (and detectability), hence making it a promising new approach for LLM decoding.…
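The abstract does not spell out the sampling rule. As a rough illustration only, the sketch below follows the permute-and-flip mechanism as it is usually described for McKenna & Sheldon (2020): visit the candidates in a random order and accept each one with probability exp((u - u_max) / T). Using next-token logits as the utility u, and the names `pf_sample`, `logits`, and `temperature`, are assumptions made for this example rather than details taken from the paper.

```python
import math
import random


def pf_sample(logits, temperature=1.0, rng=random):
    """Draw one token index with a permute-and-flip style rule.

    Assumption (not from the source text): the utility of token i is
    logits[i], and the acceptance probability is
    exp((logits[i] - max(logits)) / temperature).
    """
    if not logits:
        raise ValueError("logits must be non-empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    best = max(logits)

    # Permute: visit candidates in a uniformly random order.
    order = list(range(len(logits)))
    rng.shuffle(order)

    # Flip: accept each candidate with a probability that decays
    # exponentially in its gap to the best logit.
    for idx in order:
        accept_prob = math.exp((logits[idx] - best) / temperature)
        if rng.random() < accept_prob:
            return idx

    # Not reachable in practice: the argmax token has accept_prob == 1.0
    # and rng.random() is always strictly below 1.0.
    raise RuntimeError("no candidate accepted")
```

Calling `pf_sample([2.0, 1.0, -3.0], temperature=0.7, rng=random.Random(0))` returns a single index. With two tokens whose acceptance probabilities are 1 and q, this rule picks the weaker token with probability q/2, whereas softmax sampling would pick it with probability q/(1+q), which gives a feel for how the rule concentrates more mass on high-utility tokens.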
Peer Reviews
Decision: ICLR 2025 Poster
This paper introduces a new decoding method for large language models, called Permute-and-Flip (PF) decoding, along with a watermarking technique that allows precise control over false positive rates while maintaining high true positive rates. As a result, the approach balances detection accuracy and low perplexity, making it effective for generating high-quality text while preserving watermark detectability.
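One way to see how a false positive rate can be controlled precisely is the following sketch. It assumes, and this is not stated in the review, that the detector recovers one pseudo-random number r per token, that r is uniform on (0, 1) for unwatermarked text, and that the test statistic is the sum of -log r. Under that assumption each per-token score is Exp(1), so the sum over n tokens is Erlang(n, 1) and its tail probability has a closed form. The names `detection_score` and `null_p_value` are hypothetical.

```python
import math


def detection_score(r_values):
    """Sum of -log(r) over tokens (assumed test statistic)."""
    return sum(-math.log(r) for r in r_values)


def null_p_value(score, n_tokens):
    """P(S >= score) when S is a sum of n_tokens i.i.d. Exp(1) variables.

    Uses the Erlang survival function
        exp(-s) * sum_{k=0}^{n-1} s**k / k!
    Note: exp(-score) underflows to 0.0 for very large scores, so this
    plain version is only meant for moderate text lengths.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    if score <= 0:
        return 1.0
    term = math.exp(-score)  # k = 0 term
    total = term
    for k in range(1, n_tokens):
        term *= score / k  # builds s**k / k! incrementally
        total += term
    return min(1.0, total)
```

Flagging a text as watermarked only when `null_p_value(score, n_tokens)` falls below a chosen level alpha would then bound the false positive rate by alpha, under the uniformity assumption above.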
- While the PF watermark optimizes better, it reduces entropy in the distribution, potentially weakening the statistical signals that the watermarking scheme can leverage. This suggests that PF decoding may prioritize lower perplexity at the expense of detectability, raising questions about the balance between watermarking efficacy and text quality.
- PF decoding requires tuning the temperature T parameter to maximize detectability. Incorrect temperature settings could degrade performance, poten
This paper successfully combines the results of previous studies such as McKenna & Sheldon (2020) and Aaronson (2023) in the context of LLMs research. Theoretical guarantees are well organized and experimental performance verification is carried out sufficiently, and those make this study reliable. The proposed method seems to have practical promise.
I found no serious weakness.
The application of the Permute and Flip (PF) mechanism to sampling text from a language model is an interesting idea. The paper supports its main claims (about the effectiveness of the watermark) with both theoretical and empirical evidence.
Some parts of the overall problem formulation are not clear. For example, the stated goal is to maximize some utility function, in which case, unlike with differential privacy (i.e., the original context in which PF was proposed), the optimal decoding strategy should be deterministic. The paper acknowledges as much but nonetheless argues in favor of using sampling via the following:
> That’s because there are other considerations besides text quality when selecting LLM decoders. For example, comp
Taxonomy
Topics: Algorithms and Data Compression
