DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment
Xujie Zhang, Binbin Yang, Michael C. Kampffmeyer, Wenqing Zhang, Shiyue Zhang, Guansong Lu, Liang Lin, Hang Xu, Xiaodan Liang

TL;DR
DiffCloth introduces a diffusion-based method for cross-modal garment synthesis and manipulation that aligns visual and textual semantics structurally, enabling more accurate and flexible fashion design modifications.
Contribution
It proposes a novel structural cross-modal alignment approach using bipartite matching and semantic-bundled cross-attention, improving garment synthesis and manipulation over existing methods.
Findings
Achieves state-of-the-art results on the CM-Fashion benchmark.
Enables flexible garment manipulation via attribute phrase replacement.
Preserves spatial structure during synthesis and editing.
Abstract
Cross-modal garment synthesis and manipulation will significantly benefit the way fashion designers generate garments and modify their designs via flexible linguistic interfaces. Current approaches follow the general text-to-image paradigm and mine cross-modal relations via simple cross-attention modules, neglecting the structural correspondence between visual and textual representations in the fashion design domain. In this work, we instead introduce DiffCloth, a diffusion-based pipeline for cross-modal garment synthesis and manipulation, which empowers diffusion models with flexible compositionality in the fashion domain by structurally aligning the cross-modal semantics. Specifically, we formulate the part-level cross-modal alignment as a bipartite matching problem between the linguistic Attribute-Phrases (AP) and the visual garment parts which are obtained via constituency parsing…
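The bipartite matching formulation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cosine-similarity score, the toy embedding vectors, the phrase and part names, and the function `match_phrases_to_parts` are all assumptions made for illustration, and the exhaustive search below stands in for a proper assignment solver such as the Hungarian algorithm, which a real system would likely use.

```python
from itertools import permutations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_phrases_to_parts(phrase_vecs, part_vecs):
    """Assign each attribute phrase to a distinct garment part.

    Hypothetical helper: tries every one-to-one assignment and keeps the one
    with the highest total cosine similarity. Only practical for a handful of
    phrases and parts, since the search is exhaustive.
    """
    phrases = list(phrase_vecs)
    parts = list(part_vecs)
    if len(phrases) > len(parts):
        raise ValueError("need at least as many parts as phrases")

    best_score, best_perm = float("-inf"), None
    for perm in permutations(parts, len(phrases)):
        score = sum(
            cosine(phrase_vecs[p], part_vecs[q]) for p, q in zip(phrases, perm)
        )
        if score > best_score:
            best_score, best_perm = score, perm
    return dict(zip(phrases, best_perm)), best_score


# Toy embeddings (assumed values, not taken from the paper).
phrases = {
    "long sleeves": [1.0, 0.1, 0.0],
    "round neckline": [0.0, 1.0, 0.1],
}
parts = {
    "sleeve": [0.9, 0.0, 0.1],
    "collar": [0.1, 0.9, 0.0],
    "hem": [0.0, 0.1, 1.0],
}

matching, score = match_phrases_to_parts(phrases, parts)
print(matching)
```

With these toy vectors, each phrase is paired with the part whose embedding points in the most similar direction, and no part is used twice. The one-to-one constraint is what distinguishes a bipartite matching from simply taking the nearest part for each phrase independently.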
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Fashion and Cultural Textiles
Methods: Diffusion
