Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency
Zhenhuan Liu, Liang Li, Huajie Jiang, Xin Jin, Dandan Tu, Shuhui Wang, Zheng-Jun Zha

TL;DR
This paper introduces an unsupervised framework for coherent video cartoonization that preserves both temporal consistency and style quality by combining spatially-adaptive semantic alignment with a perceptual motion consistency regularization.
Contribution
It proposes a spatially-adaptive semantic alignment module and a style-independent regularization, the spatio-temporal correlative map, for unsupervised coherent video cartoonization.
Findings
Produces highly stylized cartoon videos that remain temporally consistent
Outperforms existing methods in qualitative and quantitative evaluations
Effectively disentangles style from temporal information
Abstract
In recent years, creative content generation tasks such as style transfer and neural photo editing have attracted growing attention. Among these, cartoonization of real-world scenes has promising applications in entertainment and industry. Unlike image translation, which focuses on improving the style effect of generated images, video cartoonization additionally requires temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an unsupervised manner. The semantic alignment module is designed to restore the deformation of semantic structure caused by the loss of spatial information in the encoder-decoder architecture. Furthermore, we devise the spatio-temporal correlative map as a style-independent, global-aware regularization on the perceptual motion consistency. Deriving…
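The abstract excerpt does not define the spatio-temporal correlative map, so the following is only an illustrative sketch of the general idea under stated assumptions, not the paper's formulation. It assumes the map is the cosine similarity between every spatial position of one frame's feature map and every position of the next frame's, and that the regularization penalizes the difference between the map of the real frame pair and the map of the cartoonized pair. The function names `correlative_map` and `motion_consistency_loss`, the L1 penalty, and the NumPy feature arrays are all hypothetical choices made for this example.

```python
import numpy as np


def correlative_map(feat_a, feat_b, eps=1e-8):
    """Cosine similarity between all spatial positions of two feature maps.

    feat_a, feat_b: arrays of shape (C, H, W), e.g. features of frames t and t+1.
    Returns an (H*W, H*W) matrix; entry (i, j) compares position i in feat_a
    with position j in feat_b. (Assumed definition, for illustration only.)
    """
    c = feat_a.shape[0]
    a = feat_a.reshape(c, -1)  # (C, H*W)
    b = feat_b.reshape(c, -1)
    # Normalize each position's feature vector so only its direction matters.
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=0, keepdims=True) + eps)
    return a.T @ b


def motion_consistency_loss(real_t, real_t1, toon_t, toon_t1):
    """Mean absolute difference between real and cartoon correlative maps.

    The L1 penalty is an assumption; the excerpt does not state the loss form.
    """
    m_real = correlative_map(real_t, real_t1)
    m_toon = correlative_map(toon_t, toon_t1)
    return float(np.mean(np.abs(m_real - m_toon)))


# Usage with random stand-in features (C=8, H=4, W=4).
rng = np.random.default_rng(0)
real_t, real_t1 = rng.normal(size=(2, 8, 4, 4))
toon_t, toon_t1 = rng.normal(size=(2, 8, 4, 4))
loss = motion_consistency_loss(real_t, real_t1, toon_t, toon_t1)
```

Because each position is compared with every position in the neighboring frame, a map of this kind is global rather than local, and because it compares relationships between features instead of the feature values themselves, a uniform rescaling of the cartoon features leaves it unchanged. This is one simple sense in which such a regularizer can be less sensitive to style than a pixel-level warping loss.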
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Video Analysis and Summarization
