SwiftVideo: A Unified Framework for Few-Step Video Generation through Trajectory-Distribution Alignment

Yanxiao Sun; Jiafu Wu; Yun Cao; Chengming Xu; Yabiao Wang; Weijian Cao; Donghao Luo; Chengjie Wang; Yanwei Fu

arXiv:2508.06082 · cs.CV · September 18, 2025

TL;DR

SwiftVideo is a novel distillation framework that combines trajectory-preserving and distribution-matching strategies to enable high-quality, few-step video generation with reduced computational cost.

Contribution

It introduces continuous-time consistency distillation and dual-perspective alignment to improve stability and performance in few-step video synthesis.

Findings

01. Outperforms existing methods on the OpenVid-1M benchmark

02. Maintains high video quality with fewer inference steps

03. Reduces computational overhead significantly

Abstract

Diffusion-based or flow-based models have achieved significant progress in video synthesis but require multiple iterative sampling steps, which incurs substantial computational overhead. While many distillation methods that are solely based on trajectory-preserving or distribution-matching have been developed to accelerate video generation models, these approaches often suffer from performance breakdown or increased artifacts under few-step settings. To address these limitations, we propose SwiftVideo, a unified and stable distillation framework that combines the advantages of trajectory-preserving and distribution-matching strategies. Our approach introduces continuous-time consistency distillation to ensure precise preservation of ODE trajectories. Subsequently, we propose a dual-perspective alignment that includes distribution alignment between synthetic and real data…

Taxonomy

Topics: Video Analysis and Summarization · Video Coding and Compression Technologies · Advanced Vision and Imaging