EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance
Zicheng Duan, Yuxuan Ding, Chenhui Gou, Ziqin Zhou, Ethan Smith, Lingqiao Liu

TL;DR
EZIGen improves zero-shot personalized image generation by balancing text and subject guidance and by using a fixed pre-trained diffusion model as the subject encoder, achieving state-of-the-art results with minimal training data.
Contribution
The paper introduces EZIGen, an approach that uses a fixed pre-trained diffusion model as the subject encoder and applies text and subject guidance at separate denoising stages, leading to superior personalized image generation with less training data.
Findings
Achieves state-of-the-art performance on personalized generation benchmarks.
Uses 100 times less training data than previous methods.
Demonstrates versatility across different diffusion models, such as SD2.1 and SDXL.
Abstract
Zero-shot personalized image generation models aim to produce images that align with both a given text prompt and subject image, requiring the model to incorporate both sources of guidance. Existing methods often struggle to capture fine-grained subject details and frequently prioritize one form of guidance over the other, resulting in suboptimal subject encoding and imbalanced generation. In this study, we uncover key insights into overcoming such drawbacks, notably that 1) the choice of the subject image encoder critically influences subject identity preservation and training efficiency, and 2) the text and subject guidance should take effect at different denoising stages. Building on these insights, we introduce a new approach, EZIGen, that employs two main components: leveraging a fixed pre-trained Diffusion UNet itself as subject encoder, following a process that balances the two…
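The second insight in the abstract, that text and subject guidance should take effect at different denoising stages, can be illustrated with a toy schedule. The sketch below is not the authors' implementation. The names `GuidanceWeights`, `guidance_schedule`, and `switch_fraction` are hypothetical, and the choice to let text guidance drive the early steps and subject guidance the later ones is an assumption made for illustration only. It shows how per-step weights could keep the two guidance sources from competing within the same step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GuidanceWeights:
    """Hypothetical per-step weights for the two guidance sources."""
    text: float
    subject: float


def guidance_schedule(num_steps: int, switch_fraction: float = 0.5) -> List[GuidanceWeights]:
    """Build a decoupled guidance schedule over `num_steps` denoising steps.

    Assumption (illustrative only): steps before the switch point use text
    guidance alone, and steps from the switch point onward use subject
    guidance alone.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    if not 0.0 <= switch_fraction <= 1.0:
        raise ValueError("switch_fraction must lie in [0, 1]")

    # Index of the first step that is driven by subject guidance.
    switch_step = int(round(num_steps * switch_fraction))

    schedule = []
    for step in range(num_steps):
        if step < switch_step:
            schedule.append(GuidanceWeights(text=1.0, subject=0.0))
        else:
            schedule.append(GuidanceWeights(text=0.0, subject=1.0))
    return schedule


if __name__ == "__main__":
    for i, w in enumerate(guidance_schedule(num_steps=10, switch_fraction=0.3)):
        print(i, w)
```

In a real sampler, these weights would gate whichever conditioning signals the denoising network accepts at each step. A hard switch is only the simplest option; a smooth ramp between the two stages would be another.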
Taxonomy
Topics: Image Processing Techniques and Applications · Cell Image Analysis Techniques · Advanced Image and Video Retrieval Techniques
Methods: ALIGN · Diffusion
