Feature-guided score diffusion for sampling conditional densities

Zahra Kadkhodaie; Stéphane Mallat; Eero P. Simoncelli

arXiv:2410.11646·cs.CV·October 16, 2024

Open Access 3 Reviews

TL;DR

This paper introduces a novel feature-guided score diffusion method that improves conditional density sampling by guiding the diffusion process with projected scores based on class feature vectors, resulting in high-quality, diverse, and generalizable samples.

Contribution

The authors propose a new algorithm that guides score diffusion with projected scores learned jointly with feature vectors, enabling better conditional density estimation and out-of-distribution generalization.

Findings

1. Generated high-quality, diverse samples from conditioned classes.
2. Feature vectors form a low-dimensional Euclidean embedding of class densities.
3. Interpolation of feature vectors enables out-of-distribution generation.
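
The interpolation finding can be pictured with a small NumPy sketch that builds a straight-line path between two class centroids in feature space. The function name, the toy centroids, and the number of steps are assumptions made for illustration; how the paper's sampler uses each intermediate target is not shown here.

```python
import numpy as np

def interpolate_centroids(mu_a, mu_b, num_steps):
    """Return num_steps feature vectors on the segment from mu_a to mu_b."""
    ts = np.linspace(0.0, 1.0, num_steps)
    # Convex combination of the two centroids for each interpolation weight t.
    return np.stack([(1.0 - t) * mu_a + t * mu_b for t in ts])

# Toy centroids (hypothetical values, not taken from the paper).
mu_a = np.zeros(3)
mu_b = np.ones(3)
path = interpolate_centroids(mu_a, mu_b, 5)
```

Each row of `path` could then serve as the target feature vector for one guided sampling run, giving a gradual transition between the two classes.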

Abstract

Score diffusion methods can learn probability densities from samples. The score of the noise-corrupted density is estimated using a deep neural network, which is then used to iteratively transport a Gaussian white noise density to a target density. Variants for conditional densities have been developed, but correct estimation of the corresponding scores is difficult. We avoid these difficulties by introducing an algorithm that guides the diffusion with a projected score. The projection pushes the image feature vector towards the feature vector centroid of the target class. The projected score and the feature vectors are learned by the same network. Specifically, the image feature vector is defined as the spatial averages of the channel activations in select layers of the network. Optimizing the projected score for denoising loss encourages image feature vectors of each class to cluster…
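
As an illustration of the feature definition in the abstract, the following NumPy sketch computes a feature vector as the concatenated spatial averages of channel activations, then a centroid per class. The function names, the (channels, height, width) array layout, and the random activations are assumptions made for this example; they are not taken from the authors' code.

```python
import numpy as np

def feature_vector(activations):
    """Concatenate per-channel spatial means over the selected layers.

    activations: list of arrays, one per layer, each shaped (C, H, W)
    (an assumed layout for this sketch).
    """
    return np.concatenate([a.mean(axis=(1, 2)) for a in activations])

def class_centroids(features, labels):
    """Average the feature vectors (rows of features) within each class."""
    labels = np.asarray(labels)
    return {c.item(): features[labels == c].mean(axis=0)
            for c in np.unique(labels)}

# Stand-in activations from two hypothetical layers with 4 and 6 channels.
rng = np.random.default_rng(0)
acts = [rng.normal(size=(4, 8, 8)), rng.normal(size=(6, 4, 4))]
phi = feature_vector(acts)  # one entry per channel, 10 in total
```

In this sketch the feature dimension is simply the total number of channels across the selected layers, and the class centroid is the target that a guided sampler would push the image feature vector towards.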

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The paper is well-structured, easy to read, and highly innovative, introducing a projected score embedded in feature space. This embedding is not only straightforward to obtain (directly extracted from the score estimation model) but also adheres to Euclidean interpolation properties.
2. The model successfully achieves conditional generation in a mixture of Gaussian distributions, demonstrating that the feature-guided score diffusion model can accurately capture conditional density—an abilit

Weaknesses

The dataset used in experiments is overly simple. The training dataset is derived by cropping 1700 images into 234k patches. Although the patches are non-overlapping, the data distribution for each class lacks sufficient diversity. Experiments on a more complex dataset, like ImageNet, would strengthen the paper’s validity.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

This is a well-written paper. Both score and feature vectors are represented with the same network. The learned feature vectors cluster around their centroids, which enhances the accuracy of sampling from the conditional probability density. The method enables gradual transitions of the images between classes through linear interpolation of mean feature vectors. The experimental results show that a diffusion algorithm based on the projected score provides an accurate sampling of conditional prob

Weaknesses

The authors provided a way to build the feature vectors that share the same network weights as the score function. It is not clear how to determine the feature vector dimension.

Reviewer 03 · Rating 5 · Confidence 4

Strengths

- Classifier-free guidance (CFG) is the most dominant approach for guiding diffusion models today, even though it is known to lead to biased densities. Several recent papers analyzed the drawbacks of the approach from a theoretical standpoint. However, the topic of designing good practical alternatives to CFG is still under-explored. This paper attempts to fill this gap, which is undoubtedly an important goal.
- The paper presents clear intuition and empirically validates that the assumptions und

Weaknesses

- The whole motivation of the paper is to propose an alternative to existing guidance methods. However, it does not provide theoretical guarantees that the approach samples from the conditional distribution. And it also does not provide any empirical evidence that the proposed approach outperforms the standard way of conditioning diffusion models. Specifically, it does not compare the sampling quality to that obtained with a conditional denoiser (with the common conditioning mechanism for U-Nets

Taxonomy

Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference · Neural Networks and Applications

Methods: Diffusion