Duality Models: An Embarrassingly Simple One-step Generation Paradigm
Peng Sun, Xinyi Shang, Tao Lin, Zhiqiang Shen

TL;DR
The paper introduces Duality Models (DuMo), a simple one-step generation paradigm that predicts the velocity and the flow map simultaneously from a single input. The authors report improved training stability and efficiency, and a state-of-the-art FID of 1.79 on ImageNet 256x256 in 2 steps.
Contribution
DuMo presents a novel "one input, dual output" approach with a shared backbone, unifying the multi-step and few-step objectives for better stability and scalability.
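The "one input, dual output" idea can be pictured with a toy network in which one shared backbone feeds two output heads. The sketch below is only an illustration under stated assumptions: the class name `DualHeadModel`, the layer sizes, the tanh backbone, and the choice to feed both the current time `t` and the target time `r` to the backbone are all hypothetical and are not taken from the paper.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer on plain Python lists: returns weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class DualHeadModel:
    """Hypothetical toy network: one shared backbone, two output heads."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Assumption: the backbone sees the sample plus both times (t, r).
        self.backbone = init_layer(rng, hidden, dim + 2)
        self.velocity_head = init_layer(rng, dim, hidden)  # local derivative
        self.flow_map_head = init_layer(rng, dim, hidden)  # global integral

    def forward(self, x, t, r):
        # Shared features are computed once and reused by both heads.
        features = [math.tanh(h) for h in linear(*self.backbone, x + [t, r])]
        velocity = linear(*self.velocity_head, features)
        flow_map = linear(*self.flow_map_head, features)
        return velocity, flow_map


model = DualHeadModel(dim=4, hidden=8)
velocity, flow_map = model.forward([0.1, -0.2, 0.3, 0.0], t=0.7, r=0.0)
print(len(velocity), len(flow_map))
```

A single forward pass yields both outputs, so a training step could attach one loss to each head without running the backbone twice. How the paper defines those losses is not covered by this summary.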
Findings
Achieves SOTA FID of 1.79 on ImageNet 256x256 in 2 steps
Significantly improves training stability and efficiency
Unifies multi-step and few-step objectives in a single model
Abstract
Consistency-based generative models like Shortcut and MeanFlow achieve impressive results via a target-aware design for solving the Probability Flow ODE (PF-ODE). Typically, such methods introduce a target time alongside the current time to modulate outputs between a local multi-step derivative and a global few-step integral. However, the conventional "one input, one output" paradigm enforces a partition of the training budget, often allocating a significant portion (e.g., 75% in MeanFlow) solely to the multi-step objective for stability. This separation forces a trade-off: allocating sufficient samples to the multi-step objective leaves the few-step generation undertrained, which harms convergence and limits scalability. To this end, we propose Duality Models (DuMo) via a "one input, dual output" paradigm. Using a shared backbone with dual heads, DuMo…
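The budget partition described in the abstract can be made concrete with a little arithmetic. The sketch below uses the 75% figure quoted for MeanFlow. The batch size of 256 and the helper `split_batch` are hypothetical, and the dual-output line assumes every sample contributes to both objectives, which the truncated abstract suggests but does not state.

```python
def split_batch(batch_size, multi_step_fraction):
    """Return (multi_step_count, few_step_count) for a partitioned batch."""
    multi_step = round(batch_size * multi_step_fraction)
    return multi_step, batch_size - multi_step


# "One input, one output": each sample serves exactly one objective.
multi, few = split_batch(batch_size=256, multi_step_fraction=0.75)
print(multi, few)  # 192 64

# "One input, dual output" (assumption): each sample supervises both heads,
# so neither objective gives up samples to the other.
dual_multi, dual_few = 256, 256
print(dual_multi, dual_few)  # 256 256
```

Under the partitioned scheme only a quarter of this hypothetical batch reaches the few-step objective, which is the undertraining the abstract points to.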
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Tensor decomposition and applications · Model Reduction and Neural Networks
