Controllable Textual Inversion for Personalized Text-to-Image Generation
Jianan Yang, Haobo Wang, Yanming Zhang, Ruixuan Xiao, Sai Wu, Gang Chen, Junbo Zhao

TL;DR
This paper introduces Controllable Textual Inversion (COTI), a robust, data-efficient method for personalized text-to-image generation. It targets three weaknesses of existing textual inversion (the need for additional datasets, heavy human effort in the loop, and lack of robustness) with a theoretically guided loss and active learning.
Contribution
COTI enhances textual inversion for personalized image generation with a theoretically guided, active-learning-based framework that reduces data requirements and improves robustness.
Findings
COTI lowers the FID score by 26.05 (lower FID is better).
COTI raises R-precision by 23%.
COTI outperforms prior TI methods on both metrics.
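For readers unfamiliar with the second metric, the sketch below shows one common way R-precision is computed for text-to-image models: each generated image is used as a query, candidate texts are ranked by cosine similarity, and a hit is counted when the matching text lands in the top R. This is an illustrative toy, not the paper's evaluation code; the function names, the toy 2-D embeddings, and the convention that image i matches text i are all assumptions made here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def r_precision(image_embs, text_embs, r=1):
    """Fraction of images whose matching text (same index) ranks in the top r.

    Assumption: image_embs[i] and text_embs[i] form the ground-truth pair.
    """
    hits = 0
    for i, img in enumerate(image_embs):
        sims = [cosine(img, txt) for txt in text_embs]
        # Candidate text indices, most similar first.
        ranked = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
        hits += i in ranked[:r]
    return hits / len(image_embs)


# Toy embeddings (hypothetical values, chosen only for illustration).
images = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
texts = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.7]]
score = r_precision(images, texts, r=1)
print(f"R-precision@1 = {score:.3f}")
```

In practice the embeddings would come from a pretrained image-text encoder and the candidate pool would be much larger; the ranking logic stays the same.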
Abstract
The recent large-scale generative modeling has attained unprecedented performance especially in producing high-fidelity images driven by text prompts. Text inversion (TI), alongside the text-to-image model backbones, is proposed as an effective technique in personalizing the generation when the prompts contain user-defined, unseen or long-tail concept tokens. Despite that, we find and show that the deployment of TI remains full of "dark-magics" -- to name a few, the harsh requirement of additional datasets, arduous human efforts in the loop and lack of robustness. In this work, we propose a much-enhanced version of TI, dubbed Controllable Textual Inversion (COTI), in resolving all the aforementioned problems and in turn delivering a robust, data-efficient and easy-to-use framework. The core to COTI is a theoretically-guided loss objective instantiated with a comprehensive and novel…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Multimodal Machine Learning Applications
