T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

TL;DR
T-Stitch is a training-free method that accelerates diffusion model sampling by switching from a smaller to a larger model during the process, maintaining quality while reducing computation.
Contribution
It introduces a novel trajectory stitching technique that improves sampling efficiency without retraining, applicable across different architectures and models.
Findings
40% of early timesteps can be replaced with a 10x faster model without quality loss
Applicable to various architectures and models, including Stable Diffusion
Enhances prompt alignment in stylized diffusion models
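As a rough worked example of the first finding, the sketch below estimates the relative compute of a stitched trajectory. It assumes, for illustration only, that sampling cost is dominated by denoiser forward passes and that "10x faster" means one tenth of the per-step cost; the function name `stitched_relative_cost` is hypothetical and does not come from the paper.

```python
def stitched_relative_cost(small_fraction: float, small_speedup: float) -> float:
    """Cost of a stitched trajectory relative to using the large model at every step.

    Assumption: every step costs one denoiser forward pass, and the small
    model's pass costs 1 / small_speedup of the large model's pass.
    """
    if not 0.0 <= small_fraction <= 1.0:
        raise ValueError("small_fraction must lie in [0, 1]")
    if small_speedup <= 0:
        raise ValueError("small_speedup must be positive")
    # Steps handled by the small model are cheap; the rest cost full price.
    return small_fraction / small_speedup + (1.0 - small_fraction)


relative = stitched_relative_cost(small_fraction=0.4, small_speedup=10.0)
print(f"relative cost: {relative:.2f}, estimated speedup: {1 / relative:.2f}x")
```

Under these assumptions the stitched trajectory costs about 64% of the large-model-only baseline. Measured speedups may differ, since this ignores any overhead outside the denoiser.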
Abstract
Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model. In this paper, we introduce sampling Trajectory Stitching (T-Stitch), a simple yet efficient technique to improve the sampling efficiency with little or no generation degradation. Instead of solely using a large DPM for the entire sampling trajectory, T-Stitch first leverages a smaller DPM in the initial steps as a cheap drop-in replacement of the larger DPM and switches to the larger DPM at a later stage. Our key insight is that different diffusion models learn similar encodings under the same training data distribution and smaller models are capable of generating good global structures in the early steps. Extensive experiments demonstrate that T-Stitch is training-free, generally applicable for different architectures, and…
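The switching scheme described in the abstract can be sketched as a sampling loop that picks the denoiser by step index. This is a minimal illustration, not the authors' implementation: `t_stitch_sample`, `toy_step`, and the toy denoisers are hypothetical names, the update rule is a placeholder rather than a real DDPM/DDIM step, and plain Python lists stand in for latent tensors.

```python
from typing import Callable, List, Sequence

Latent = List[float]
Denoiser = Callable[[Latent, int], Latent]        # (x_t, t) -> noise estimate
StepFn = Callable[[Latent, Latent, int], Latent]  # (x_t, eps, t) -> next latent


def t_stitch_sample(small_model: Denoiser, large_model: Denoiser, x: Latent,
                    timesteps: Sequence[int], small_fraction: float,
                    step: StepFn) -> Latent:
    """Run the small model for the first `small_fraction` of steps, then the large one."""
    n_small = int(round(small_fraction * len(timesteps)))
    for i, t in enumerate(timesteps):
        # Early (noisiest) steps go to the cheap model; later steps to the large model.
        model = small_model if i < n_small else large_model
        eps = model(x, t)
        x = step(x, eps, t)
    return x


def toy_step(x: Latent, eps: Latent, t: int) -> Latent:
    """Placeholder update rule (assumption): move a little against the noise estimate."""
    return [xi - 0.1 * ei for xi, ei in zip(x, eps)]


# Toy denoisers that record which model handled each step.
usage: List[str] = []


def toy_small(x: Latent, t: int) -> Latent:
    usage.append("small")
    return list(x)


def toy_large(x: Latent, t: int) -> Latent:
    usage.append("large")
    return list(x)


sample = t_stitch_sample(toy_small, toy_large, [1.0, -1.0],
                         timesteps=list(range(9, -1, -1)),
                         small_fraction=0.4, step=toy_step)
print(usage.count("small"), "small-model steps,", usage.count("large"), "large-model steps")
```

In a real pipeline both denoisers would need to operate on the same latent space and noise schedule so that the handoff is a drop-in swap, which is the property the abstract relies on.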
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Markov Chains and Monte Carlo Methods
Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
