StyleClone: Face Stylization with Diffusion Based Data Augmentation
Neeraj Matiyali, Siddharth Srivastava, Gaurav Sharma

TL;DR
StyleClone introduces a diffusion-based data augmentation technique for training image-to-image translation networks to stylize faces when only a few style images are available. The resulting networks stylize faster and at higher quality than diffusion-based methods.
Contribution
StyleClone uses textual inversion and diffusion-based guided image generation to augment small style datasets, then trains fast image-to-image translation networks on the augmented data. These networks outperform diffusion-based methods in both speed and quality.
Findings
Enhanced stylization quality and content preservation
Significant acceleration of inference speed
Effective augmentation of small style datasets
Abstract
We present StyleClone, a method for training image-to-image translation networks to stylize faces in a specific style, even with limited style images. Our approach leverages textual inversion and diffusion-based guided image generation to augment small style datasets. By systematically generating diverse style samples guided by both the original style images and real face images, we significantly enhance the diversity of the style dataset. Using this augmented dataset, we train fast image-to-image translation networks that outperform diffusion-based methods in speed and quality. Experiments on multiple styles demonstrate that our method improves stylization quality, better preserves source image content, and significantly accelerates inference. Additionally, we provide a systematic evaluation of the augmentation techniques and their impact on stylization performance.
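The augmentation step described in the abstract, generating style samples guided by both the original style images and real face images, can be pictured with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the `<style>` token name (standing in for a token learned by textual inversion), the prompt wording, the `strength` values, and the `generate` callable, which stands in for whatever guided diffusion generator is used.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class AugmentationJob:
    guide_image: str  # path of the image that guides generation
    guide_kind: str   # "style" (original style image) or "face" (real face image)
    prompt: str       # prompt containing the learned style token (hypothetical wording)
    strength: float   # assumed knob: how far the output may depart from the guide


def plan_augmentation(
    style_images: Sequence[str],
    face_images: Sequence[str],
    style_token: str = "<style>",            # hypothetical textual-inversion token
    strengths: Sequence[float] = (0.4, 0.6),  # hypothetical values
) -> List[AugmentationJob]:
    """Enumerate one job per (guide image, strength) pair, style guides first."""
    prompt = f"a portrait in {style_token} style"
    guides = [(p, "style") for p in style_images] + [(p, "face") for p in face_images]
    return [
        AugmentationJob(path, kind, prompt, s)
        for (path, kind), s in product(guides, strengths)
    ]


def augment(jobs: Sequence[AugmentationJob], generate: Callable[[AugmentationJob], T]) -> List[T]:
    """Run a caller-supplied generator over every job and collect the outputs."""
    return [generate(job) for job in jobs]
```

In use, `generate` would wrap a guided diffusion call and return an image; the outputs, together with the original style images, would form the augmented dataset on which the image-to-image translation network is trained. Passing the generator in as a callable keeps the planning logic independent of any particular diffusion library.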
