DreamArtist++: Controllable One-Shot Text-to-Image Generation via Positive-Negative Adapter
Ziyi Dong, Pengxu Wei, Liang Lin

TL;DR
DreamArtist++ introduces a positive-negative prompt-tuning framework for one-shot text-to-image generation, effectively balancing controllability, fidelity, and diversity using a joint optimization of positive and negative adapters on pre-trained diffusion models.
Contribution
The paper presents a novel positive-negative prompt-tuning strategy that improves one-shot image generation quality and controllability, outperforming existing methods.
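As a rough illustration of how a positive and a negative prompt branch can be combined at sampling time, here is a minimal sketch. It assumes the usual negative-prompt form of classifier-free guidance, which this summary does not spell out, so it should not be read as the authors' exact formulation. The function name `guided_noise` and the toy numbers are hypothetical, and the training of the adapters themselves is not shown.

```python
from typing import List


def guided_noise(eps_pos: List[float], eps_neg: List[float], scale: float) -> List[float]:
    """Combine noise predictions from a positive and a negative prompt branch.

    Assumed rule (standard negative-prompt classifier-free guidance):
    start from the negative prediction and move along the
    (positive - negative) direction by `scale`.
    """
    if len(eps_pos) != len(eps_neg):
        raise ValueError("predictions must have the same length")
    return [n + scale * (p - n) for p, n in zip(eps_pos, eps_neg)]


# Toy values standing in for a denoiser's outputs (hypothetical numbers).
eps_pos = [1.0, 2.0, -1.0]
eps_neg = [0.0, 1.0, 1.0]
result = guided_noise(eps_pos, eps_neg, scale=3.0)
print(result)
```

With `scale=1.0` the rule returns the positive prediction unchanged, and larger scales push the result further away from what the negative branch predicts.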
Findings
Achieves superior image fidelity and diversity compared to existing methods.
Captures reference characteristics effectively from only one example.
Demonstrates versatility in concept composition and image editing tasks.
Abstract
State-of-the-art text-to-image generation models such as Imagen and Stable Diffusion have made remarkable progress in synthesizing high-quality, feature-rich, high-resolution images guided by human text prompts. Since certain characteristics of image content, e.g., very specific object entities or styles, are very hard to describe accurately in text, some example-based image generation approaches have been proposed, i.e., generating new concepts by absorbing the salient features of a few input references. Despite acknowledged successes, these methods have struggled to accurately capture the reference examples' characteristics while keeping image generation diverse and high-quality, particularly in the one-shot scenario (i.e., given only one reference). To tackle this problem, we propose a simple yet effective framework, namely…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Multimodal Machine Learning Applications
Methods: Diffusion
