TL;DR
This paper introduces a real-time, class-based style transfer method that applies different styles to different object classes in a video. It combines semantic segmentation masks with globally styled frames, and the segmentation network (DABNet) runs at 104 FPS.
Contribution
The method combines semantic segmentation with style transfer to achieve real-time, localized style application for multiple object classes in videos.
Findings
High-quality localized style transfer observed on a video from the CityScapes dataset
Segmentation with DABNet runs at 104 FPS
DABNet is lightweight, with only 0.76 million parameters
Abstract
We propose a Class-Based Styling method (CBS) that can map different styles for different object classes in real-time. CBS achieves real-time performance by carrying out two steps simultaneously. While a semantic segmentation method is used to obtain the mask of each object class in a video frame, a styling method is used to style that frame globally. Then an object class can be styled by combining the segmentation mask and the styled image. The user can also select multiple styles so that different object classes can have different styles in a single frame. For semantic segmentation, we leverage DABNet that achieves high accuracy, yet only has 0.76 million parameters and runs at 104 FPS. For the style transfer step, we use a popular real-time method proposed by Johnson et al. [7]. We evaluated CBS on a video of the CityScapes dataset and observed high-quality localized style transfer…
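The combination step described in the abstract (a per-class mask applied to a globally styled frame) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `composite_class_styles` and its arguments are hypothetical, and the label map and styled frames are assumed to have already been produced by a segmentation network and a style transfer network such as those named above.

```python
import numpy as np


def composite_class_styles(frame, label_map, styled_by_class):
    """Paste styled pixels into the original frame, one object class at a time.

    frame:           (H, W, 3) uint8 array, the original video frame.
    label_map:       (H, W) integer array of per-pixel class ids
                     (assumed output of the segmentation step).
    styled_by_class: dict mapping class id -> (H, W, 3) uint8 array, the
                     whole frame styled with the style chosen for that class
                     (assumed output of the styling step).
    """
    out = frame.copy()
    for class_id, styled in styled_by_class.items():
        mask = label_map == class_id  # boolean mask for this class
        out[mask] = styled[mask]      # keep styled pixels only inside the mask
    return out


# Toy 2x2 example: class 1 gets one style, class 2 another, class 0 stays unstyled.
frame = np.zeros((2, 2, 3), dtype=np.uint8)
label_map = np.array([[0, 1], [2, 1]])
styled_by_class = {
    1: np.full((2, 2, 3), 100, dtype=np.uint8),
    2: np.full((2, 2, 3), 200, dtype=np.uint8),
}
result = composite_class_styles(frame, label_map, styled_by_class)
```

Because each styled frame is computed for the whole image, the segmentation and styling steps do not depend on each other and can run at the same time; only this final masking step needs both outputs.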
