DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder
Ente Lin, Xujie Zhang, Fuwei Zhao, Yuxuan Luo, Xin Dong, Long Zeng, Xiaodan Liang

TL;DR
DreamFit introduces a lightweight, versatile diffusion model for garment-centric human image generation that maintains high quality and broad generalization with minimal training complexity.
Contribution
The paper presents DreamFit, a novel lightweight diffusion-based framework with adaptive attention and LoRA modules for high-quality, generalizable garment-centric human generation.
Findings
Outperforms existing methods on high-resolution benchmarks
Maintains high diversity and consistency across various garments and styles
Requires only 83.4M trainable parameters for efficient training
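To give a sense of why LoRA modules keep the trainable parameter count low, the sketch below compares the parameters of a fully finetuned linear layer with those of a rank-r LoRA update on the same layer. The layer width (1024) and rank (16) are hypothetical values chosen for illustration; they are not DreamFit's configuration, and the sketch does not reproduce the 83.4M figure.

```python
def full_linear_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Parameters trained when the whole linear layer is finetuned."""
    return d_in * d_out + (d_out if bias else 0)


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters trained by a LoRA update W + B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original weight W stays frozen.
    """
    return rank * d_in + d_out * rank


# Hypothetical sizes for illustration only (not taken from the paper).
d_in = d_out = 1024
rank = 16

full = full_linear_params(d_in, d_out, bias=False)
lora = lora_params(d_in, d_out, rank)

print(f"full finetune: {full} params")
print(f"LoRA (r={rank}): {lora} params")
print(f"fraction trained: {lora / full:.4f}")
```

Under these assumed sizes, the LoRA update trains about 3% of the parameters that full finetuning of the same layer would.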
Abstract
Diffusion models for garment-centric human generation from text or image prompts have attracted growing attention for their application potential. However, existing methods face a dilemma: lightweight approaches, such as adapters, tend to generate inconsistent textures, while finetuning-based methods incur high training costs and struggle to preserve the generalization capabilities of pretrained diffusion models, limiting their performance across diverse scenarios. To address these challenges, we propose DreamFit, which incorporates a lightweight Anything-Dressing Encoder tailored for garment-centric human generation. DreamFit has three key advantages: (1) Lightweight training: with the proposed adaptive attention and LoRA modules, DreamFit reduces the model complexity to 83.4M trainable parameters.…
Taxonomy
Topics: Innovative Human-Technology Interaction
Methods: Softmax · Attention Is All You Need · Diffusion
