Discrete Feynman-Kac Correctors
Mohsin Hasan, Viktor Ohanesian, Artem Gazizov, Yoshua Bengio, Alán Aspuru-Guzik, Roberto Bondesan, Marta Skreta, Kirill Neklyudov

TL;DR
This paper introduces Discrete Feynman-Kac Correctors, a framework for controlling the distribution sampled by a trained discrete diffusion model at inference time, without additional training.
Contribution
The authors develop a novel SMC-based framework for controlling discrete diffusion models at inference time, allowing for distribution manipulation without retraining.
Findings
Effective sampling from the Ising model's Boltzmann distribution
Improved language model performance for code generation
Enhanced reward-tilted protein sequence generation
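For the Ising finding above, a Boltzmann distribution has the form p(s) ∝ exp(-β E(s)), and on a very small lattice it can be computed exactly by enumerating every spin configuration. The sketch below is only an illustration of that target distribution, not the paper's experimental setup: the 3×3 periodic lattice, the coupling `J`, the inverse temperature `beta`, and the function names are all assumptions chosen for this example.

```python
import itertools
import numpy as np

def ising_energy(spins, J=1.0):
    """Energy of a 2D Ising configuration with periodic boundaries and no field."""
    # Shifting right and down counts each nearest-neighbour bond exactly once
    # (valid for lattice side >= 3; a side of 2 would double-count bonds).
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -J * np.sum(spins * (right + down))

def exact_boltzmann(L=3, beta=0.4, J=1.0):
    """Enumerate all 2**(L*L) configurations and return their exact probabilities."""
    configs = [np.array(c).reshape(L, L)
               for c in itertools.product([-1, 1], repeat=L * L)]
    energies = np.array([ising_energy(c, J) for c in configs])
    # Subtract the maximum log-weight before exponentiating for stability.
    log_w = -beta * energies
    log_w -= log_w.max()
    probs = np.exp(log_w)
    probs /= probs.sum()
    return configs, energies, probs

configs, energies, probs = exact_boltzmann()
mean_energy = float(np.sum(probs * energies))
print(f"{len(configs)} configurations, mean energy {mean_energy:.3f}")
```

Enumeration scales as 2^(L·L), so it is only usable as a reference for tiny lattices; samplers are needed for anything larger.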
Abstract
Discrete diffusion models have recently emerged as a promising alternative to the autoregressive approach for generating discrete sequences. Sample generation via gradual denoising or demasking processes allows them to capture hierarchical non-sequential interdependencies in the data. These custom processes, however, do not assume a flexible control over the distribution of generated samples. We propose Discrete Feynman-Kac Correctors, a framework that allows for controlling the generated distribution of discrete masked diffusion models at inference time. We derive Sequential Monte Carlo (SMC) algorithms that, given a trained discrete diffusion model, control the temperature of the sampled distribution (i.e. perform annealing), sample from the product of marginals of several diffusion processes (e.g. differently conditioned processes), and sample from the product of the marginal with an…
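To make the annealing idea in the abstract concrete, here is a minimal sketch of reweighting and resampling on a toy categorical distribution. It is not the paper's algorithm, which operates on the denoising process of a trained diffusion model; it only shows the generic SMC pattern of moving particles from p toward p^β / Z through a ladder of intermediate temperatures. The function name `anneal_smc`, the linear temperature schedule, and the example distribution are assumptions made for this illustration.

```python
import numpy as np

def anneal_smc(p, beta, n_particles=20000, n_steps=5, seed=0):
    """Toy SMC: move particles from p to p**beta / Z via reweight + resample."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    log_p = np.log(p)  # assumes every entry of p is strictly positive
    # Start with particles drawn from the base distribution (beta = 1).
    x = rng.choice(len(p), size=n_particles, p=p)
    # Assumed schedule: linearly spaced inverse temperatures from 1 to beta.
    betas = np.linspace(1.0, beta, n_steps + 1)
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Incremental importance weight p(x)**(b_next - b_prev), in log space.
        log_w = (b_next - b_prev) * log_p[x]
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Multinomial resampling; afterwards all particles have equal weight.
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
    # Empirical distribution of the final particles.
    return np.bincount(x, minlength=len(p)) / n_particles

p = np.array([0.5, 0.3, 0.15, 0.05])
beta = 2.0
estimate = anneal_smc(p, beta)
exact = p**beta / np.sum(p**beta)
print("SMC estimate:", np.round(estimate, 3))
print("exact target:", np.round(exact, 3))
```

This toy version has no move step between resampling rounds, so particle diversity can only shrink. A practical sampler would interleave transitions that leave each intermediate target invariant.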
Peer Reviews
Decision: Submitted to ICLR 2026
1. **Clear and ambitious motivation:** The paper tackles an important and current challenge in discrete generative modeling: how to control discrete diffusion models at inference time, an area that has received much less attention than its continuous counterpart.
2. **Elegant theoretical formulation:** The authors successfully extend the Feynman–Kac Corrector framework from continuous stochastic differential equations to discrete-state continuous-time Markov chains (CTMCs), providing a clean m

1. **Lack of formal convergence guarantees:** The paper derives correct and interpretable rate equations, but the convergence properties of the Sequential Monte Carlo (SMC) estimators in the discrete setting are not analyzed in depth. There are no explicit results on variance, bias, or sample complexity, which weakens the theoretical completeness of the
2. **Assumption-heavy derivations:** Several key results rely on strong idealizations, such as perfect knowledge of the marginal ratios or ergo

- The manuscript is well-organized and clearly written. The exposition is concise yet thorough, facilitating a clear understanding of the core contributions and methodologies.
- The paper presents theoretically rigorous and well-founded derivations. The mathematical treatment of annealing, distribution product formation, and reward tilting via reweighting and SMC methods is sound.

- Unclear motivation: Given the presented and evaluated inference alignment strategies, their usefulness is not convincingly demonstrated. While there is previous work showing that annealing can be beneficial when sampling from Boltzmann distributions, the current manuscript presents this possible advantage for using FKC only in a rather toy-like experiment. Potential benefits for language models are only shown for synthetic and toy tasks rather than real world language modeling at scale. Reward

- The paper presents a theoretically sound and novel approach.
- The method is rigorously evaluated across a broad spectrum of benchmarks, demonstrating its versatility and potential applicability.

- The benchmarks on the Ising model lack comparison to theoretically available ground truth solutions, which would strengthen the validation of the proposed method.
- The evaluation does not include the critical temperature regime of the Ising model, where sampling is known to be particularly challenging.
- The notation $g_t(i)$ appears in Line 122 but is not formally introduced or defined elsewhere in the text.
- With the exception of Figure 4b, the paper does not provide direct comparisons to
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques
