Improved Training Technique for Shortcut Models

Anh Nguyen; Viet Nguyen; Duc Vu; Trung Dao; Chi Tran; Toan Tran; Anh Tran

arXiv:2510.21250 · cs.CV · October 27, 2025

TL;DR

This paper introduces iSM, a training framework that addresses key limitations of shortcut models, enabling high-quality, flexible, and stable generation across one-step, few-step, and multi-step sampling.

Contribution

The paper presents iSM, a unified training method with four innovations that significantly improves shortcut models' performance and stability for generative tasks.

Findings

01 Substantial FID improvements on ImageNet 256x256

02 Enhanced high-frequency detail preservation

03 Stable multi-step generation performance

Abstract

Shortcut models represent a promising, non-adversarial paradigm for generative modeling, uniquely supporting one-step, few-step, and multi-step sampling from a single trained network. However, their widespread adoption has been stymied by critical performance bottlenecks. This paper tackles the five core issues that held shortcut models back: (1) the hidden flaw of compounding guidance, which we are the first to formalize, causing severe image artifacts; (2) inflexible fixed guidance that restricts inference-time control; (3) a pervasive frequency bias driven by a reliance on low-level distances in the direct domain, which biases reconstructions toward low frequencies; (4) divergent self-consistency arising from a conflict with EMA training; and (5) curvy flow trajectories that impede convergence. To address these challenges, we introduce iSM, a unified training framework that…
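Issue (3) above, the frequency bias of low-level distances computed in the direct domain, can be illustrated with a toy signal. The sketch below is not taken from the paper: the 1-D signal, its amplitudes, and the helper names `sinusoid` and `mse` are assumptions chosen for illustration. It builds a target from a strong low-frequency component plus a weak high-frequency one, then scores a "blurry" reconstruction that drops all fine detail.

```python
import math

N = 64  # number of samples in the toy signal (assumed size)


def sinusoid(freq, amp):
    """Return N samples of amp * sin(2*pi*freq*n/N)."""
    return [amp * math.sin(2 * math.pi * freq * n / N) for n in range(N)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


zeros = [0.0] * N

# Coarse structure carries most of the energy; fine detail carries little.
low = sinusoid(freq=2, amp=1.0)
high = sinusoid(freq=20, amp=0.1)
target = [l + h for l, h in zip(low, high)]

# A reconstruction that keeps the coarse structure and drops all fine detail.
blurry = low

# Distance measured directly on the samples (the "direct domain").
direct_mse = mse(target, blurry)

# The same error expressed relative to the total signal power.
relative_error = direct_mse / mse(target, zeros)

# Fraction of the high-frequency band that the reconstruction is missing.
high_band_missing = mse(high, zeros) / mse(high, zeros)

print("direct-domain MSE:", direct_mse)
print("error relative to signal power:", relative_error)
print("fraction of high band missing:", high_band_missing)
```

In this toy setting the reconstruction loses the entire high-frequency band, yet its direct-domain error stays below one percent of the signal power. A loss of this kind therefore gives the model little incentive to recover fine detail, which is one way to read the bias the abstract describes.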

Videos

Improved Training Technique for Shortcut Models · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications