EruDiff: Refactoring Knowledge in Diffusion Models for Advanced Text-to-Image Synthesis

Xiefan Guo; Xinzhu Ma; Haoxiang Ma; Zihao Zhou; Di Huang

arXiv:2603.20828·cs.CV·March 24, 2026

Open Access

TL;DR

EruDiff refactors the knowledge structures of text-to-image diffusion models so that they better handle implicit prompts requiring deep world knowledge, with significant performance gains reported on the Science-T2I and WISE benchmarks.

Contribution

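The paper proposes two strategies: Diffusion Knowledge Distribution Matching (DK-DM), which aligns the knowledge distribution of implicit prompts with that of explicit anchors, and NO-RL, which corrects biases in explicit prompt rendering. Together they improve the model's ability to process implicit prompts.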
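The idea of aligning an implicit prompt with an explicit anchor can be illustrated with a toy sketch. This is not the paper's DK-DM objective, which this summary does not specify. It assumes a hypothetical linear noise predictor (`toy_denoiser`), treats the explicit anchor's prediction as a frozen target, and updates only the implicit prompt's conditioning vector; which parameters the actual method trains is not stated here.

```python
import random

def toy_denoiser(x_t, cond):
    """Hypothetical stand-in for a noise predictor: linear in input and condition."""
    return [0.5 * x + c for x, c in zip(x_t, cond)]

def matching_loss(pred_implicit, pred_anchor):
    """Mean squared gap between the two predictions (anchor treated as fixed)."""
    n = len(pred_anchor)
    return sum((p - a) ** 2 for p, a in zip(pred_implicit, pred_anchor)) / n

def align_implicit_to_anchor(implicit_cond, explicit_cond, steps=200, lr=0.1, seed=0):
    """Pull the implicit conditioning toward the explicit anchor by gradient descent."""
    rng = random.Random(seed)
    cond = list(implicit_cond)  # trainable copy; the explicit anchor stays frozen
    dim = len(cond)
    for _ in range(steps):
        # Both branches see the same noisy sample x_t.
        x_t = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        pred_imp = toy_denoiser(x_t, cond)
        pred_anc = toy_denoiser(x_t, explicit_cond)  # target only, no update
        # Analytic gradient of matching_loss w.r.t. cond for this toy model.
        grad = [2.0 * (p - a) / dim for p, a in zip(pred_imp, pred_anc)]
        cond = [c - lr * g for c, g in zip(cond, grad)]
    return cond

# Illustrative (made-up) conditioning vectors.
implicit = [1.0, -2.0, 0.5]
explicit = [0.2, 0.4, -0.1]
aligned = align_implicit_to_anchor(implicit, explicit)
```

In this toy setting the gap between the two predictions shrinks geometrically, so `aligned` ends up close to `explicit`. A real diffusion model would replace `toy_denoiser` with its learned network and compute gradients by automatic differentiation.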

Findings

01. Significant performance improvements on the Science-T2I and WISE benchmarks.

02. Enhanced handling of implicit prompts requiring deep world knowledge.

03. Effective knowledge refactoring in diffusion models demonstrated.

Abstract

Text-to-image diffusion models have achieved remarkable fidelity in synthesizing images from explicit text prompts, yet exhibit a critical deficiency in processing implicit prompts that require deep-level world knowledge, ranging from natural sciences to cultural commonsense, resulting in counter-factual synthesis. This paper traces the root of this limitation to a fundamental dislocation of the underlying knowledge structures, manifesting as a chaotic organization of implicit prompts compared to their explicit counterparts. In this paper, we propose EruDiff, which aims to refactor the knowledge within diffusion models. Specifically, we develop the Diffusion Knowledge Distribution Matching (DK-DM) to register the knowledge distribution of intractable implicit prompts with that of well-defined explicit anchors. Furthermore, to rectify the inherent biases in explicit prompt rendering, we…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Model Reduction and Neural Networks