Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
Yuxin Jiang, Liming Jiang, Shuai Yang, Chen Change Loy

TL;DR
Scenimefy is a semi-supervised framework that transforms real-world images into anime scenes, leveraging pseudo data, segmentation, and contrastive loss to improve stylization, details, and semantic consistency.
Contribution
It introduces a novel semi-supervised image translation method using structure-consistent pseudo data and a high-resolution anime dataset, advancing anime scene rendering quality.
Findings
Outperforms state-of-the-art methods in perceptual quality
Achieves better semantic preservation and stylization
Provides a new high-resolution anime scene dataset
Abstract
Automatic high-quality rendering of anime scenes from complex real-world images is of significant practical value. The challenges of this task lie in the complexity of the scenes, the unique features of anime style, and the lack of high-quality datasets to bridge the domain gap. Despite promising attempts, previous efforts are still incompetent in achieving satisfactory results with consistent semantic preservation, evident stylization, and fine details. In this study, we propose Scenimefy, a novel semi-supervised image-to-image translation framework that addresses these challenges. Our approach guides the learning with structure-consistent pseudo paired data, simplifying the pure unsupervised setting. The pseudo data are derived uniquely from a semantic-constrained StyleGAN leveraging rich model priors like CLIP. We further apply segmentation-guided data selection to obtain…
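The TL;DR above mentions a contrastive loss, but this page does not spell out its form. As a rough illustration only, the sketch below shows a generic InfoNCE-style contrastive loss for a single query feature: it is not taken from the paper or its released code. The function name `info_nce`, the plain-list feature vectors, and the temperature of 0.07 are all assumptions chosen for the example.

```python
import math


def info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for one query vector (illustrative, not the paper's code).

    query, positive: lists of floats (feature vectors of equal length).
    negatives: list of feature vectors that should NOT match the query.
    temperature: assumed value; scales the cosine similarities.
    """
    def normalize(v):
        # L2-normalize so the dot product becomes a cosine similarity.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    q = normalize(query)
    # Index 0 holds the positive pair; the rest are negatives.
    logits = [dot(q, normalize(positive)) / temperature]
    logits += [dot(q, normalize(n)) / temperature for n in negatives]

    # Cross-entropy with the positive as the target class,
    # computed with a numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < misaligned)
```

The loss is small when the query is closer to its positive than to any negative, and grows when a negative is the better match. In image translation, losses of this family are typically applied to patch features so that an output patch stays close to the corresponding input patch.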
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
Methods: Dense Connections · Convolution · R1 Regularization · Contrastive Language-Image Pre-training · Feedforward Network · Adaptive Instance Normalization · StyleGAN
