Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch

Xu Cai; Yang Wu; Qianli Chen; Haoran Wu; Lichuan Xiang; Hongkai Wen

arXiv:2510.17858·cs.CV·October 22, 2025

TL;DR

This paper introduces a highly efficient post-training method for transforming large pre-trained flow matching diffusion models into few-step samplers using velocity field self-distillation, significantly reducing computational costs.

Contribution

The authors propose a novel velocity field self-distillation technique that enables aggressive shortcutting in standard flow matching models (e.g., Flux) without retraining from scratch or adding a step-size embedding, which keeps post-training cheap and makes few-shot distillation possible.

Findings

01. Distilled Flux into a 3-step sampler with less than one A100 day of training.

02. Enabled few-shot distillation with as few as 10 text-image pairs.

03. Produced state-of-the-art performance at minimal cost.

Abstract

We present an ultra-efficient post-training method for shortcutting large-scale pre-trained flow matching diffusion models into efficient few-step samplers, enabled by novel velocity field self-distillation. While shortcutting in flow matching, originally introduced by shortcut models, offers flexible trajectory-skipping capabilities, it requires a specialized step-size embedding incompatible with existing models unless retraining from scratch, a process nearly as costly as pretraining itself. Our key contribution is thus imparting a more aggressive shortcut mechanism to standard flow matching models (e.g., Flux), leveraging a unique distillation principle that obviates the need for step-size embedding. Working on the velocity field rather than sample space and learning rapidly from self-guided distillation in an online manner, our approach trains efficiently, e.g.,…
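For readers unfamiliar with what a "few-step sampler" means here, the sketch below shows generic Euler sampling of a flow matching ODE, dx/dt = v(x, t), using a small number of steps. It is not the authors' method or code: `euler_sample`, `toy_velocity`, and `TARGET` are hypothetical names, the velocity field is a hand-written toy rather than a trained network, and the time convention (t running from 0 at noise to 1 at data) is an assumption that real models such as Flux may reverse.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with uniform Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt                      # current time, always strictly below 1
        v = velocity(x, t)              # one velocity evaluation per step
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical "data" point; a real model would condition on text instead.
TARGET: Vector = [2.0, -1.0]


def toy_velocity(x: Vector, t: float) -> Vector:
    """Toy straight-line field: points from x toward TARGET, scaled by 1 / (1 - t)."""
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(TARGET, x)]


# Three velocity evaluations, mirroring the 3-step setting mentioned above.
sample = euler_sample(toy_velocity, [0.0, 0.0], num_steps=3)
print(sample)
```

Because the toy field follows perfectly straight trajectories, three Euler steps already land on `TARGET` up to floating-point error. Trajectories of a pre-trained model are generally curved, so coarse steps lose accuracy; closing that gap is what shortcutting and distillation aim for.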

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques