Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?

Yujin Han; Andi Han; Wei Huang; Chaochao Lu; Difan Zou

arXiv:2502.04725·cs.CV·February 10, 2025

Open Access

TL;DR

This paper investigates the ability of diffusion models to learn hidden inter-feature rules in images, revealing their limitations in capturing fine-grained relationships and analyzing the reasons behind these failures.

Contribution

It provides empirical evidence of diffusion models' struggles with fine-grained rules, introduces synthetic tasks for assessment, and offers theoretical insights into the limitations of denoising score matching.

Findings

01. Diffusion models fail to learn fine-grained inter-feature rules.

02. Incorporating classifier guidance offers limited improvements.

03. Theoretical analysis shows that denoising score matching (DSM) objectives cause constant errors in rule learning.
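
For orientation, the DSM objective mentioned above is commonly written in the following standard form. This is the textbook formulation, not necessarily the notation used in the paper. The symbols $\mathbf{z}_0$ (a clean image), $\mathbf{z}_t$ (its noised version at time $t$), $s_\theta$ (the learned score network), and $\lambda(t)$ (a time-dependent weight) are assumptions introduced here, with $\mathbf{z}$ chosen to avoid clashing with the feature names $x$ and $y$ in the abstract.

```latex
\mathcal{L}_{\mathrm{DSM}}(\theta)
  = \mathbb{E}_{t}\Big[\lambda(t)\,
    \mathbb{E}_{\mathbf{z}_0 \sim p_{\mathrm{data}}}\,
    \mathbb{E}_{\mathbf{z}_t \sim p_t(\mathbf{z}_t \mid \mathbf{z}_0)}
    \big\| s_\theta(\mathbf{z}_t, t)
      - \nabla_{\mathbf{z}_t} \log p_t(\mathbf{z}_t \mid \mathbf{z}_0) \big\|_2^2
  \Big]
```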

Abstract

Despite the remarkable success of diffusion models (DMs) in data generation, they exhibit specific failure cases with unsatisfactory outputs. We focus on one such limitation: the ability of DMs to learn hidden rules between image features. Specifically, for image data with dependent features $x$ and $y$ (e.g., the height of the sun $x$ and the length of the shadow $y$), we investigate whether DMs can accurately capture the inter-feature rule $p(y \mid x)$. Empirical evaluations on mainstream DMs (e.g., Stable Diffusion 3.5) reveal consistent failures, such as inconsistent lighting-shadow relationships and mismatched object-mirror reflections. Inspired by these findings, we design four synthetic tasks with strongly correlated features to assess DMs' rule-learning abilities. Extensive experiments show that while DMs can identify…
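
To make the sun-and-shadow example concrete, here is a minimal sketch of how one might score whether generated samples respect such a rule. It is not the paper's evaluation protocol. It assumes the two features (sun elevation in degrees and shadow length) have already been extracted from each image by some hypothetical upstream step, and it replaces the distribution $p(y \mid x)$ with the deterministic flat-ground relation, shadow length = object height / tan(elevation). The function names, the fixed object height, and the tolerance are all illustrative assumptions.

```python
import math


def expected_shadow_length(object_height: float, sun_elevation_deg: float) -> float:
    """Shadow length of a vertical object on flat ground: h / tan(elevation)."""
    return object_height / math.tan(math.radians(sun_elevation_deg))


def rule_violation_rate(samples, object_height: float = 1.0, rel_tol: float = 0.05) -> float:
    """Fraction of (elevation_deg, shadow_length) pairs that break the rule.

    A pair counts as a violation when the observed shadow length differs from
    the geometric target by more than rel_tol (relative to the target).
    """
    violations = 0
    for elevation_deg, shadow_length in samples:
        target = expected_shadow_length(object_height, elevation_deg)
        if abs(shadow_length - target) > rel_tol * target:
            violations += 1
    return violations / len(samples)


# Hypothetical features extracted from three generated images.
samples = [(45.0, 1.0), (30.0, 1.0), (60.0, 0.58)]
print(rule_violation_rate(samples))
```

In this toy input only the second pair violates the rule: a sun at 30 degrees should cast a shadow of about 1.73 object heights, not 1.0. A coarse check such as "lower sun means longer shadow" would not catch that kind of error, whereas the fine-grained check does.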

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Diffusion · Focus · Denoising Score Matching