TeleStyle: Content-Preserving Style Transfer in Images and Videos
Shiwen Zhang, Xiaoyan Yang, Bojia Zi, Haibin Huang, Chi Zhang, Xuelong Li

TL;DR
TeleStyle is a lightweight model for content-preserving style transfer in images and videos. It is trained with curriculum learning on a new dataset, and the authors report state-of-the-art results in style similarity, content fidelity, and visual quality.
Contribution
The paper introduces TeleStyle, a model built on Qwen-Image-Edit and trained with a Curriculum Continual Learning framework on a hybrid dataset of curated (clean) and synthetic (noisy) triplets, with the aim of improving style transfer quality and generalization for images and videos.
Findings
Achieves state-of-the-art performance in style similarity and content preservation.
Effectively generalizes to unseen styles through curriculum continual learning.
Enhances temporal consistency in video stylization.
Abstract
Content-preserving style transfer, generating stylized outputs based on content and style references, remains a significant challenge for Diffusion Transformers (DiTs) due to the inherent entanglement of content and style features in their internal representations. In this technical report, we present TeleStyle, a lightweight yet effective model for both image and video stylization. Built upon Qwen-Image-Edit, TeleStyle leverages the base model's robust capabilities in content preservation and style customization. To facilitate effective training, we curated a high-quality dataset of distinct specific styles and further synthesized triplets using thousands of diverse, in-the-wild style categories. We introduce a Curriculum Continual Learning framework to train TeleStyle on this hybrid dataset of clean (curated) and noisy (synthetic) triplets. This approach enables the model to…
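The abstract is truncated before it describes the curriculum schedule, so the following is only a minimal sketch of one way a curriculum over clean and noisy triplets could be set up. It is not the authors' method. The linear schedule, the function names noisy_fraction and sample_batch, the max_noisy parameter, and the representation of a triplet as three string identifiers are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical triplet: (content id, style id, stylized target id).
Triplet = Tuple[str, str, str]


def noisy_fraction(stage: int, num_stages: int, max_noisy: float = 0.8) -> float:
    """Assumed linear schedule: 0.0 at the first stage, max_noisy at the last."""
    if num_stages < 1 or not 0 <= stage < num_stages:
        raise ValueError("stage must satisfy 0 <= stage < num_stages")
    if num_stages == 1:
        return 0.0
    return max_noisy * stage / (num_stages - 1)


def sample_batch(
    clean: Sequence[Triplet],
    noisy: Sequence[Triplet],
    stage: int,
    num_stages: int,
    batch_size: int,
    rng: random.Random,
) -> List[Triplet]:
    """Draw a batch whose share of noisy (synthetic) triplets grows with the stage."""
    n_noisy = round(noisy_fraction(stage, num_stages) * batch_size)
    n_clean = batch_size - n_noisy
    # Sample with replacement so small curated sets can still fill a batch.
    batch = rng.choices(clean, k=n_clean) + rng.choices(noisy, k=n_noisy)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [(f"clean_c{i}", f"clean_s{i}", f"clean_t{i}") for i in range(5)]
    noisy = [(f"noisy_c{i}", f"noisy_s{i}", f"noisy_t{i}") for i in range(50)]
    for stage in range(3):
        batch = sample_batch(clean, noisy, stage, num_stages=3, batch_size=10, rng=rng)
        share = sum(t[0].startswith("noisy") for t in batch) / len(batch)
        print(f"stage {stage}: noisy share = {share:.1f}")
```

Under this assumed schedule, early stages see only curated triplets and later stages mix in progressively more synthetic ones. A real schedule could just as well be stepwise or run in a different order.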
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Computer Graphics and Visualization Techniques
