When Are Two Scores Better Than One? Investigating Ensembles of Diffusion Models
Raphaël Razafindralambo, Rémy Sun, Frédéric Precioso, Damien Garreau, Pierre-Alexandre Mattei

TL;DR
This paper investigates the effectiveness of ensembling diffusion models, finding improvements in likelihood but inconsistent gains in perceptual quality, and provides theoretical insights into score model aggregation.
Contribution
It systematically evaluates ensembling methods for diffusion models, revealing their impact on likelihood and quality, and offers theoretical analysis of score model summation techniques.
Findings
Ensembling improves score-matching loss and likelihood.
Ensembling does not consistently improve perceptual quality metrics.
A specific aggregation strategy outperforms others in tabular data.
Abstract
Diffusion models now generate high-quality, diverse samples, with an increasing focus on more powerful models. Although ensembling is a well-known way to improve supervised models, its application to unconditional score-based diffusion models remains largely unexplored. In this work we investigate whether it provides tangible benefits for generative modelling. We find that while ensembling the scores generally improves the score-matching loss and model likelihood, it fails to consistently enhance perceptual quality metrics such as FID on image datasets. We confirm this observation across a breadth of aggregation rules, using Deep Ensembles and Monte Carlo Dropout on CIFAR-10 and FFHQ. We attempt to explain this discrepancy by investigating possible explanations, such as the link between score estimation and image quality. We also look into tabular data through random forests, and find that…
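To make "ensembling the scores" concrete, here is a minimal toy sketch in Python. It is not the authors' code or experimental setup. It assumes a 2-D standard normal target (whose true score is -x), stands in for trained score networks with randomly perturbed versions of the true score (the hypothetical helper `make_noisy_score`), and uses only the arithmetic mean as the aggregation rule. The error measured is the squared distance to the known true score, which is only available in toy settings like this one.

```python
import numpy as np

def true_score(x):
    # Score of a standard normal density: grad_x log p(x) = -x.
    return -x

def make_noisy_score(rng, scale=0.3):
    # Hypothetical stand-in for a trained score model:
    # the true score with a random slope and offset error.
    a = rng.normal(0.0, scale)
    b = rng.normal(0.0, scale)
    return lambda x: -(1.0 + a) * x + b

def ensemble_score(models, x):
    # Aggregate by taking the arithmetic mean of the individual scores.
    return np.mean([m(x) for m in models], axis=0)

def score_error(score_values, x):
    # Monte Carlo estimate of E ||s(x) - grad_x log p(x)||^2.
    diff = score_values - true_score(x)
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))            # samples from the toy target
models = [make_noisy_score(rng) for _ in range(5)]

individual_errors = [score_error(m(x), x) for m in models]
ensemble_error = score_error(ensemble_score(models, x), x)

print("mean individual error:", np.mean(individual_errors))
print("ensemble error:       ", ensemble_error)
```

In this sketch the ensemble error cannot exceed the mean of the individual errors, because the squared norm is convex (Jensen's inequality). That matches the direction of the score-matching finding reported above, but it says nothing about perceptual metrics such as FID, which is where the paper reports inconsistent gains.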
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Advanced Neuroimaging Techniques and Applications
