Controllable Generation via Locally Constrained Resampling
Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

TL;DR
This paper introduces a probabilistic method for constrained language generation that improves adherence to logical and safety constraints by conditioning on the entire sequence, outperforming existing methods in detoxification and puzzle solving.
Contribution
The paper presents a Bayesian conditioning approach for autoregressive models that considers the entire sequence, yielding more globally optimal constrained generation than greedy methods through local resampling and bias correction.
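The resample-then-correct idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy model `target_prob`, the constraint, the per-position factorization in `local_factorized`, and the brute-force enumeration in `condition_on_constraint`, which stands in for the tractable conditioning the paper performs. The bias correction shown is plain sampling-importance-resampling.

```python
import random
from itertools import product

VOCAB = (0, 1)   # hypothetical binary vocabulary; tokens double as list indices
LENGTH = 3       # hypothetical sequence length


def target_prob(y):
    """Hypothetical autoregressive model: each token tends to repeat the previous one."""
    prob = 0.5
    for prev, cur in zip(y, y[1:]):
        prob *= 0.8 if cur == prev else 0.2
    return prob


def constraint(y):
    """Hypothetical constraint: exactly one token equals 1."""
    return sum(y) == 1


def local_factorized(y_tilde):
    """Per-position factors q_i(v), proportional to p(y_tilde with token i set to v)."""
    factors = []
    for i in range(LENGTH):
        scores = [target_prob(y_tilde[:i] + (v,) + y_tilde[i + 1:]) for v in VOCAB]
        total = sum(scores)
        factors.append([s / total for s in scores])
    return factors


def condition_on_constraint(factors):
    """Condition the factorized distribution on the constraint (brute force, toy sizes only)."""
    weights = {}
    for y in product(VOCAB, repeat=LENGTH):
        if constraint(y):
            w = 1.0
            for i, v in enumerate(y):
                w *= factors[i][v]
            weights[y] = w
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


def resample(y_tilde, num_candidates=16, rng=random):
    """Draw constrained candidates near y_tilde, then correct the bias by importance resampling."""
    proposal = condition_on_constraint(local_factorized(y_tilde))
    support = list(proposal)
    candidates = rng.choices(support, weights=[proposal[y] for y in support], k=num_candidates)
    # Importance weight: target probability over proposal probability.
    importance = [target_prob(y) / proposal[y] for y in candidates]
    return rng.choices(candidates, weights=importance, k=1)[0]


sample = resample((0, 0, 0), rng=random.Random(0))
print(sample, constraint(sample))
```

Every candidate satisfies the constraint by construction, since the proposal has no mass outside it. The importance weights only shift the choice among candidates toward the target model.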
Findings
Outperforms existing detoxification methods in avoiding toxic outputs.
Achieves 100% accuracy on Sudoku puzzles, surpassing GPT-4 and Gemini 1.5.
Produces samples that closely match the target distribution while satisfying constraints.
Abstract
Autoregressive models have demonstrated an unprecedented ability at modeling the intricacies of natural language. However, they continue to struggle with generating complex outputs that adhere to logical constraints. Sampling from a fully-independent distribution subject to a constraint is hard. Sampling from an autoregressive distribution subject to a constraint is doubly hard: We have to contend not only with the hardness of the constraint but also the distribution's lack of structure. We propose a tractable probabilistic approach that performs Bayesian conditioning to draw samples subject to a constraint. Our approach considers the entire sequence, leading to a more globally optimal constrained generation than current greedy methods. Starting from a model sample, we induce a local, factorized distribution which we can tractably condition on the constraint. To generate samples that…
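The contrast between greedy constraint enforcement and conditioning on the whole sequence can be seen on a toy problem. The sketch below is not taken from the paper: the two-token model (`P_FIRST`, `P_SECOND`) and the constraint are hypothetical, and exact conditioning is done by enumeration, which is only feasible at this size.

```python
from itertools import product

# Hypothetical toy autoregressive model over two binary tokens.
P_FIRST = {0: 0.9, 1: 0.1}               # p(y1)
P_SECOND = {0: {0: 0.9, 1: 0.1},         # p(y2 | y1 = 0)
            1: {0: 0.9, 1: 0.1}}         # p(y2 | y1 = 1)
TOKENS = (0, 1)


def constraint(y):
    """Hypothetical constraint: the two tokens must differ."""
    return y[0] != y[1]


def joint(y):
    """Unconstrained model probability p(y1) * p(y2 | y1)."""
    return P_FIRST[y[0]] * P_SECOND[y[0]][y[1]]


def exact_conditional():
    """Bayesian conditioning: p(y | constraint), computed by enumeration."""
    satisfying = {y: joint(y) for y in product(TOKENS, repeat=2) if constraint(y)}
    total = sum(satisfying.values())
    return {y: p / total for y, p in satisfying.items()}


def greedy_masked():
    """Distribution induced by masking infeasible tokens at each step and renormalizing."""
    dist = {}
    feasible_first = [a for a in TOKENS if any(constraint((a, b)) for b in TOKENS)]
    z_first = sum(P_FIRST[a] for a in feasible_first)
    for a in feasible_first:
        feasible_second = [b for b in TOKENS if constraint((a, b))]
        z_second = sum(P_SECOND[a][b] for b in feasible_second)
        for b in feasible_second:
            dist[(a, b)] = (P_FIRST[a] / z_first) * (P_SECOND[a][b] / z_second)
    return dist


exact = exact_conditional()
greedy = greedy_masked()
for y in sorted(exact):
    print(y, round(exact[y], 3), round(greedy[y], 3))
```

In this toy model the two satisfying sequences are equally likely under exact conditioning. Step-wise masking instead keeps the first token's unconstrained probability, because the renormalization at the second step discards how unlikely the forced continuation was.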
Peer Reviews
Decision: ICLR 2025 Poster
- Relevance/significance: As the authors write in the introduction, conditioning LMs with logical constraints is important but difficult. This work proposes a method that is guaranteed to generate samples that satisfy constraints and, unlike others, requires no Monte Carlo or model training.
- Clarity:
  - The presentation of Sections 1-4 is, in my opinion, very good. The reviewer is familiar with circuits and graphical models, which certainly helps (a simple diagram appearing earlier in the pap
- Experiments:
  - Presentation: It would be good to see examples of the input and desired/undesired output for each task in the main text. It is hard to understand what is being done from text descriptions alone.
  - I do not find the results very convincing for a few reasons:
    - Error bars are not reported => impossible to assess significance.
    - Comparison with methods from prior work: only a trivial baseline is compared with for LLM detoxification, and only cold-prompted LLMs for Sudok
* The probabilistic circuits formulation moves beyond greedy token-by-step constraint enforcement. It is an interesting application of probabilistic circuits to a broadly applicable problem of constrained sampling in LLMs.
* The logical circuits also provide a more expressive and efficient constraint representation compared to traditional DFAs which are typically used in constrained sampling in LLMs.
* The paper is quite well written, with easy to follow explanations for the idea, along with e
* A key aspect for sampling algorithms is the runtime, but from what I can tell there is no discussion of the runtime of the approach and how it compares to the baselines. Another aspect is the memory usage (which is also a challenge for large models). A further aspect, unclear from the results alone, is how the method scales with the sequence length.
* While results are promising as initial proof-of-concept, the experiments are mainly about relatively small-scale tasks. There is a limited
originality:
- The method originally builds upon existing techniques in Bayesian inference and constraint circuits to tackle constrained generation with LLMs.
- The presented approach seems to be a step change with respect to simpler greedy methods.
quality:
- The method is effective, leading to great performance.
significance:
- Constrained generation is a central problem with LLMs, so addressing it is essential.
The major issue with the paper is the lack of clarity in the notation and in presenting the method. In particular:
- First, I don't find myself fully convinced that the greedy sampling approach leads to samples that are not exact, due to the following reasons:
  - it seems to me that greedy sampling is effectively identical (even though not operationally identical) to rejection sampling of full completions, by which I mean sampling multiple completions until they do not satisfy the constraint be
Taxonomy
Topics: Matrix Theory and Algorithms · Numerical methods for differential equations
