DreamArtist++: Controllable One-Shot Text-to-Image Generation via Positive-Negative Adapter

Ziyi Dong; Pengxu Wei; Liang Lin

arXiv:2211.11337·cs.CV·January 31, 2025·5 cites

Open Access

TL;DR

DreamArtist++ introduces a positive-negative prompt-tuning framework for one-shot text-to-image generation, effectively balancing controllability, fidelity, and diversity using a joint optimization of positive and negative adapters on pre-trained diffusion models.
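
As a rough illustration of how a positive and a negative branch can jointly steer a pre-trained diffusion model, the sketch below combines two noise predictions in the style of classifier-free guidance with a negative prompt. This is an assumption about the general mechanism, not the paper's confirmed implementation; the function name `combine_pos_neg`, the arguments `eps_pos` and `eps_neg`, and the default `guidance_scale` are hypothetical.

```python
import numpy as np


def combine_pos_neg(eps_pos, eps_neg, guidance_scale=5.0):
    """Combine positive- and negative-conditioned noise predictions.

    Hypothetical helper: eps_pos and eps_neg stand in for the denoiser's
    outputs under the positive and negative conditioning, respectively.
    The result starts at the negative prediction and moves toward (and,
    for scales above 1, past) the positive one.
    """
    eps_pos = np.asarray(eps_pos, dtype=np.float64)
    eps_neg = np.asarray(eps_neg, dtype=np.float64)
    if eps_pos.shape != eps_neg.shape:
        raise ValueError("eps_pos and eps_neg must have the same shape")
    # Classifier-free-guidance-style extrapolation away from the negative branch.
    return eps_neg + guidance_scale * (eps_pos - eps_neg)
```

With `guidance_scale=1.0` the output equals the positive prediction; larger values push the result further away from whatever the negative branch encodes.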

Contribution

The paper presents a novel positive-negative prompt-tuning strategy that improves one-shot image generation quality and controllability, outperforming existing methods.

Findings

01

Achieved superior image fidelity and diversity compared to existing methods.

02

Effectively captures reference characteristics with only one example.

03

Demonstrated versatility in concept composition and image editing tasks.

Abstract

State-of-the-art text-to-image generation models such as Imagen and Stable Diffusion have made remarkable progress in synthesizing high-quality, feature-rich, high-resolution images guided by human text prompts. Because certain characteristics of image content, e.g., very specific object entities or styles, are very hard to describe accurately in text, some example-based image generation approaches have been proposed, i.e., generating new concepts by absorbing the salient features of a few input references. Despite their acknowledged successes, these methods have struggled to accurately capture the reference examples' characteristics while keeping image generation diverse and high-quality, particularly in the one-shot scenario (i.e., given only one reference). To tackle this problem, we propose a simple yet effective framework, namely…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Multimodal Machine Learning Applications

Methods: Diffusion