Single Trajectory Distillation for Accelerating Image and Video Style Transfer
Sijie Xu, Runqi Wang, Wei Zhu, Dejia Song, Nemo Chen, Xu Tang, Yao Hu

TL;DR
This paper introduces Single Trajectory Distillation (STD), a method that accelerates diffusion-based image and video style transfer by enforcing consistency along the whole trajectory rather than only at its initial step, yielding faster stylization with improved style quality.
Contribution
The paper proposes a trajectory distillation approach that enforces consistency along the full trajectory and uses a trajectory bank to store the teacher model's trajectory states, which reduces the time cost of training and speeds up diffusion-based stylization.
Findings
Outperforms existing models in style similarity
Achieves higher aesthetic quality in stylized images and videos
Reduces computational cost of diffusion-based stylization
Abstract
Diffusion-based stylization methods typically denoise from a specific partial noise state for image-to-image and video-to-video tasks. This multi-step diffusion process is computationally expensive and hinders real-world application. A promising solution to speed up the process is to obtain few-step consistency models through trajectory distillation. However, current consistency models only force the initial-step alignment between the probability flow ODE (PF-ODE) trajectories of the student and the imperfect teacher models. This training strategy cannot ensure the consistency of whole trajectories. To address this issue, we propose single trajectory distillation (STD) starting from a specific partial noise state. We introduce a trajectory bank to store the teacher model's trajectory states, mitigating the time cost during training. In addition, we use an asymmetric adversarial loss to…
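The two ideas named in the abstract, starting from a partial noise state and caching teacher trajectory states in a bank, can be illustrated with a toy sketch. This is not the authors' implementation. `partial_noise` uses the standard forward-diffusion formula, while `toy_teacher_step`, `build_trajectory`, and the FIFO `TrajectoryBank` are hypothetical stand-ins chosen for illustration, and the bank's real design is not described in this summary.

```python
import math
import random
from collections import deque


def partial_noise(x0, alpha_bar_t, rng):
    """Standard forward diffusion: sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps."""
    a = math.sqrt(alpha_bar_t)
    s = math.sqrt(1.0 - alpha_bar_t)
    return [a * v + s * rng.gauss(0.0, 1.0) for v in x0]


def toy_teacher_step(x, t_from, t_to):
    """Hypothetical stand-in for one teacher ODE solver step (not a real model)."""
    return [v * (t_to + 1.0) / (t_from + 1.0) for v in x]


def build_trajectory(x_start, timesteps):
    """Roll the toy teacher from the partial noise state through the timesteps."""
    states = [(timesteps[0], x_start)]
    x = x_start
    for t_from, t_to in zip(timesteps, timesteps[1:]):
        x = toy_teacher_step(x, t_from, t_to)
        states.append((t_to, x))
    return states


class TrajectoryBank:
    """Assumed design: a fixed-size FIFO cache of teacher trajectories."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def push(self, trajectory):
        # When the bank is full, the oldest trajectory is dropped automatically.
        self._items.append(trajectory)

    def sample(self, rng):
        # Reuse a stored trajectory instead of re-running the teacher.
        return rng.choice(list(self._items))

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    rng = random.Random(0)
    bank = TrajectoryBank(capacity=4)
    x_t = partial_noise([0.5, -0.5], alpha_bar_t=0.6, rng=rng)
    bank.push(build_trajectory(x_t, timesteps=[600, 400, 200, 0]))
    for t, state in bank.sample(rng):
        print(t, state)
```

In a training loop along these lines, a student would draw stored states from the bank rather than solving the teacher trajectory again at every iteration, which is the time saving the abstract attributes to the bank.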
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Advanced Image Processing Techniques
Methods: Consistency Models · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
