RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation
Xianfeng Tan, Yuhan Li, Wenxiang Shang, Yubo Wu, Jian Wang, Xuanhong Chen, Yi Zhang, Ran Lin, Bingbing Ni

TL;DR
RAGDiffusion is a novel framework that combines retrieval-augmented generation with diffusion models to produce highly faithful clothing images, addressing structural hallucinations and texture distortions in complex scenarios.
Contribution
The paper introduces retrieval-based structure aggregation and omni-level texture alignment, pioneering high-fidelity garment generation that integrates external knowledge.
Findings
Significant improvement in structural accuracy and texture fidelity.
Effective mitigation of hallucinations in complex clothing scenarios.
Outperforms existing models on real-world datasets.
Abstract
Standard clothing asset generation involves restoring forward-facing flat-lay garment images displayed on a clear background by extracting clothing information from diverse real-world contexts, which presents significant challenges due to highly standardized structure sampling distributions and clothing semantic absence in complex scenarios. Existing models have limited spatial perception, often exhibiting structural hallucinations and texture distortion in this high-specification generative task. To address this issue, we propose a novel Retrieval-Augmented Generation (RAG) framework, termed RAGDiffusion, to enhance structure determinacy and mitigate hallucinations by assimilating knowledge from language models and external databases. RAGDiffusion consists of two processes: (1) Retrieval-based structure aggregation, which employs contrastive learning and a Structure Locally Linear…
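The abstract mentions retrieving structural knowledge from an external database. As a rough illustration of that general idea only, the sketch below ranks stored embeddings by cosine similarity to a query embedding and returns the top matches. It is not the authors' method: the database contents, the three-dimensional vectors, and the names `STRUCTURE_DB`, `cosine_similarity`, and `retrieve_top_k` are all hypothetical, and the paper's actual encoder and aggregation step are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, database, k=2):
    """Return the k (key, score) pairs most similar to the query, best first."""
    scored = [(key, cosine_similarity(query, emb)) for key, emb in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy database: keys and vectors are made up for illustration.
STRUCTURE_DB = {
    "tshirt_template": [1.0, 0.0, 0.0],
    "hoodie_template": [0.8, 0.6, 0.0],
    "trousers_template": [0.0, 0.0, 1.0],
}

# Hypothetical query embedding, standing in for an encoded in-the-wild garment photo.
query_embedding = [0.9, 0.1, 0.0]
neighbors = retrieve_top_k(query_embedding, STRUCTURE_DB, k=2)
for key, score in neighbors:
    print(f"{key}: {score:.3f}")
```

In a real system the embeddings would come from a learned encoder and the database would be indexed for approximate nearest-neighbor search; the linear scan here is only meant to make the retrieval step concrete.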
Taxonomy
Topics: Fashion and Cultural Textiles
Methods: Weight Decay · Linear Warmup With Linear Decay · Linear Layer · Layer Normalization · WordPiece · Attention Dropout · Multi-Head Attention · Byte Pair Encoding · BERT
