SCoT: Unifying Consistency Models and Rectified Flows via Straight-Consistent Trajectories
Zhangkai Wu, Xuhui Fan, Hongyu Wu, Longbing Cao

TL;DR
SCoT unifies consistency models and rectified flows in a single distillation model whose sampling trajectories are both consistent and straight, combining the fast-sampling advantages of the two approaches.
Contribution
The paper introduces SCoT, a model that unifies consistency models and the rectified flow method to improve sampling speed and accuracy when distilling pre-trained diffusion models.
Findings
SCoT achieves faster sampling with high-quality results.
SCoT maintains trajectory consistency and straightness.
Experimental results validate SCoT's effectiveness and efficiency.
Abstract
Pre-trained diffusion models are commonly used to generate clean data (e.g., images) from random noise, effectively forming pairs of noise samples and corresponding clean images. Distillation of these pre-trained models can be viewed as the process of constructing advanced trajectories within each pair to accelerate sampling. For instance, consistency model distillation develops consistent projection functions to regulate trajectories, although sampling efficiency remains a concern. The rectified flow method enforces straight trajectories to enable faster sampling, yet it relies on numerical ODE solvers, which may introduce approximation errors. In this work, we bridge the gap between the consistency model and the rectified flow method by proposing a Straight Consistent Trajectory (SCoT) model. SCoT enjoys the benefits of both approaches for fast sampling, producing trajectories with consistent and…
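The two properties contrasted in the abstract can be illustrated with a toy NumPy sketch. This is not the SCoT method or the authors' code. The function names (`straight_point`, `euler_sample`, `consistency_map`), the convention that t = 0 is noise and t = 1 is clean data, and the hand-picked curved velocity field are all assumptions made for illustration. The sketch shows that an Euler solver reaches the endpoint of a straight trajectory in one step but incurs approximation error on a curved one, and that a consistent projection maps every point on a trajectory to the same endpoint.

```python
import numpy as np


def straight_point(x0, x1, t):
    """Point at time t on the straight line from x0 (t=0) to x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def euler_sample(x0, velocity_fn, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with forward Euler."""
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x


def consistency_map(x_t, t, velocity):
    """Toy projection for a straight trajectory with constant velocity:
    jump from any point x_t directly to the endpoint at t=1."""
    return x_t + (1.0 - t) * velocity


# Hypothetical noise/data pair (2-D for readability).
x0 = np.array([0.0, 0.0])   # "noise"
x1 = np.array([2.0, -1.0])  # "clean data"
v = x1 - x0                 # constant velocity of the straight path


def straight_field(x, t):
    # Straight trajectory: the velocity does not depend on x or t.
    return v


def curved_field(x, t):
    # Assumed curved trajectory x(t) = x0 + t^2 * (x1 - x0); same endpoints.
    return 2.0 * t * v


# One Euler step is exact on the straight path.
one_step_straight = euler_sample(x0, straight_field, num_steps=1)
print("straight, 1 step, error:", np.linalg.norm(one_step_straight - x1))

# On the curved path, the solver needs many steps and still has some error.
for n in (1, 10, 100):
    approx = euler_sample(x0, curved_field, num_steps=n)
    print(f"curved, {n} steps, error:", np.linalg.norm(approx - x1))

# Consistency: every point on the straight path projects to the same endpoint.
for t in (0.0, 0.25, 0.5, 0.9):
    x_t = straight_point(x0, x1, t)
    print(f"t={t}: projected endpoint =", consistency_map(x_t, t, v))
```

In this toy setting the projection is available in closed form only because the path is known to be straight. In a learned model, both the velocity and the projection would be neural networks, and how SCoT trains them is not specified in the excerpt above.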
Taxonomy
Topics: Data Management and Algorithms · Semantic Web and Ontologies · Transportation and Mobility Innovations
Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus
