TL;DR
This paper introduces Diffusive Classification (DiffCLF), a training objective for time-dependent energy-based models (EBMs) that recasts EBM learning as supervised classification, avoiding the mode blindness of score matching while remaining computationally efficient.
Contribution
The paper proposes DiffCLF, a supervised classification-based training method for EBMs that overcomes mode blindness and enhances model fidelity and applicability.
Findings
DiffCLF accurately estimates energies in Gaussian mixtures.
Models trained with DiffCLF perform well in composition and Boltzmann sampling tasks.
DiffCLF outperforms existing EBM training methods in efficiency and quality.
Abstract
Score-based generative models have recently achieved remarkable success. While they are usually parameterized by the score, an alternative way is to use a series of time-dependent energy-based models (EBMs), where the score is obtained from the negative input-gradient of the energy. Crucially, EBMs can be leveraged not only for generation, but also for tasks such as compositional sampling or building Boltzmann Generators via Monte Carlo methods. However, training EBMs remains challenging. Direct maximum likelihood is computationally prohibitive due to the need for nested sampling, while score matching, though efficient, suffers from mode blindness. To address these issues, we introduce the Diffusive Classification (DiffCLF) objective, a simple method that avoids blindness while remaining computationally efficient. DiffCLF reframes EBM learning as a supervised classification problem…
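The two facts the abstract leans on, that the score is the negative input-gradient of the energy and that score matching can be blind to mode weights, can be illustrated with a small sketch. This is not the paper's method or code: the 1D two-component Gaussian mixture, the function names, and the finite-difference gradient are all assumptions chosen for illustration, and the energy is taken to be the exact negative log-density. With well-separated modes, changing the mixture weights shifts the energy near each mode by a constant but leaves the score essentially unchanged.

```python
import math

def log_mixture_density(x, weight, mean=8.0, std=1.0):
    """Log-density of weight*N(-mean, std^2) + (1 - weight)*N(+mean, std^2).

    Hypothetical toy target, not taken from the paper.
    """
    log_norm = math.log(std * math.sqrt(2.0 * math.pi))
    a = math.log(weight) - 0.5 * ((x + mean) / std) ** 2 - log_norm
    b = math.log(1.0 - weight) - 0.5 * ((x - mean) / std) ** 2 - log_norm
    top = max(a, b)  # log-sum-exp for numerical stability
    return top + math.log(math.exp(a - top) + math.exp(b - top))

def energy(x, weight):
    """Energy defined here as the negative log-density (normalized case)."""
    return -log_mixture_density(x, weight)

def score(x, weight, eps=1e-4):
    """Score = negative input-gradient of the energy (central difference)."""
    return -(energy(x + eps, weight) - energy(x - eps, weight)) / (2.0 * eps)

# Evaluate near both modes for two different mixture weights.
points = [-9.0, -8.0, -7.0, 7.0, 8.0, 9.0]
score_gap = max(abs(score(x, 0.5) - score(x, 0.9)) for x in points)
energy_gap = max(abs(energy(x, 0.5) - energy(x, 0.9)) for x in points)

print(f"largest score difference:  {score_gap:.2e}")
print(f"largest energy difference: {energy_gap:.3f}")
```

Near the modes the two scores agree to within numerical error while the energies differ by a constant of order one, which is why an objective that only sees the score struggles to recover relative mode weights.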
Peer Reviews
Decision: Submitted to ICLR 2026
The proposed loss function for training time-dependent energy functions seems to be novel and interesting.
Overall, the paper is rather poorly written. While the manuscript spends several pages introducing the background, the key part of the proposed approach (Section 3) is rather brief and needs further elaboration. In its current form, the objective function is not clearly explained, in particular why such an objective is used. Moreover, the statement of the theoretical result is not precise and seems to be missing assumptions. For example, the authors claimed that the score-matching methods suffer from t
* Addresses a real limitation of current diffusion models (mode blindness) by attempting to model energies, not only scores.
* The idea of connecting diffusion training with a classification/NCE-type loss is novel and potentially useful.
* Theoretical motivation and experiments are at least qualitatively consistent with the intended effect.
* The paper is very difficult to follow. Section 2 in particular is chaotic: notation such as $X_t$, $Y_t$, $p_t$, $q_t$, $S(t)$, $\sigma(t)$ is introduced with little intuition or connection, and the relationship between the data distribution and the time-evolving process is unclear.
* The stated objective “to estimate the densities $(p_t)_t$” is conceptually confusing: the goal should be to model the data distribution $p_0$, not all intermediate marginals.
* Equations (6)–(7) appear without suf
- The proposed method allows the energy function and the normalizing constant to be learned simultaneously, which is known to be a challenging task.
- The authors provide a rich literature overview and connect their work to many related works in generative modelling, in particular score-based models.
- The proposed approach provides consistently better classification performance than the alternatives.
- The proposed method provides an interesting framework to train the energy function and the log-normalizing constant simultaneously. However, the alternatives considered in the paper, such as score-based training, have been widely studied over the past few years. Wasserstein and KL upper bounds have been proposed in the strongly log-concave case and under weaker assumptions on the data distribution. As the method highlights better performance in classification loss, it would be very interesting to obt
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Machine Learning in Healthcare
