StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan

TL;DR
StyleDrop is a versatile method that enables text-to-image models to faithfully generate images in specific styles, capturing detailed nuances with minimal training data and parameters.
Contribution
The paper introduces StyleDrop, a fine-tuning approach that learns and reproduces complex styles while training very few parameters on minimal data, outperforming DreamBooth and textual inversion.
Findings
StyleDrop outperforms DreamBooth and textual inversion in style transfer quality.
It effectively captures detailed style nuances from a single image.
The method requires less than 1% of model parameters for training.
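To give a feel for what "less than 1% of model parameters" can mean in practice, here is a rough back-of-the-envelope sketch. It assumes a frozen transformer backbone with a small bottleneck adapter added to each layer; this summary does not state the mechanism StyleDrop uses, and the layer count, widths, and bottleneck size below are hypothetical values chosen only for illustration, not figures from the paper.

```python
def linear_params(d_in, d_out, bias=True):
    """Parameter count of a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


def transformer_layer_params(d_model, d_ff):
    """Rough count for one transformer layer (norm layers ignored)."""
    # Attention: query, key, value, and output projections.
    attention = 4 * linear_params(d_model, d_model)
    # Feed-forward: expand to d_ff, then project back to d_model.
    feed_forward = linear_params(d_model, d_ff) + linear_params(d_ff, d_model)
    return attention + feed_forward


def adapter_params(d_model, bottleneck):
    """Bottleneck adapter: down-projection followed by up-projection."""
    return linear_params(d_model, bottleneck) + linear_params(bottleneck, d_model)


def trainable_fraction(num_layers, d_model, d_ff, bottleneck):
    """Share of all parameters that are trainable when only adapters train."""
    frozen = num_layers * transformer_layer_params(d_model, d_ff)
    trainable = num_layers * adapter_params(d_model, bottleneck)
    return trainable / (frozen + trainable)


# Hypothetical configuration, NOT taken from the paper.
fraction = trainable_fraction(num_layers=24, d_model=1024, d_ff=4096, bottleneck=32)
print(f"trainable share of parameters: {fraction:.2%}")
```

With these assumed sizes the adapters account for well under 1% of the total, which is the order of magnitude the finding describes. Shrinking the bottleneck lowers the share further, at the cost of adapter capacity.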
Abstract
Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts. However, ambiguities inherent in natural language and out-of-distribution effects make it hard to synthesize image styles that leverage a specific design pattern, texture, or material. In this paper, we introduce StyleDrop, a method that enables the synthesis of images that faithfully follow a specific style using a text-to-image model. The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. It efficiently learns a new style by fine-tuning very few trainable parameters (less than 1% of total model parameters) and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Motion and Animation · Computer Graphics and Visualization Techniques
