Reward-Aware Trajectory Shaping for Few-step Visual Generation
Rui Li, Bingyu Li, Yuanzhi Liang, Haibin Huang, Chi Zhang, XueLong Li

TL;DR
This paper introduces RATS, a framework that improves few-step visual generation by aligning teacher and student latent trajectories and applying reward-aware gating, enabling the student to potentially outperform its teacher without extra computation.
Contribution
RATS is a preference-aligned trajectory shaping method that goes beyond teacher-imitating distillation by optimizing the few-step student toward reward-preferred generation quality.
Findings
RATS significantly narrows the gap between few-step and multi-step generators.
It improves the efficiency-quality trade-off in visual generation tasks.
Experimental results validate the effectiveness of reward-aware trajectory shaping.
Abstract
Achieving high-fidelity generation in extremely few sampling steps has long been a central goal of generative modeling. Existing approaches largely rely on distillation-based frameworks to compress the original multi-step denoising process into a few-step generator. However, such methods inherently constrain the student to imitate a stronger multi-step teacher, imposing the teacher as an upper bound on student performance. We argue that introducing preference alignment awareness enables the student to optimize toward reward-preferred generation quality, potentially surpassing the teacher instead of being restricted to rigid teacher imitation. To this end, we propose Reward-Aware Trajectory Shaping (RATS), a lightweight framework for preference-aligned few-step generation. Specifically, teacher and student latent trajectories are aligned at key denoising stages through…
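As a rough illustration of the idea summarized above (aligning student and teacher latents at selected stages, with the strength of that alignment depending on reward), here is a minimal sketch. The abstract is truncated before it states the actual mechanism, so everything here is an assumption made for illustration: the function name `gated_alignment_loss`, the sigmoid gate on the reward difference, the mean-squared distance, and the `temperature` parameter are all hypothetical and not taken from the paper.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_alignment_loss(student_traj, teacher_traj,
                         student_reward, teacher_reward,
                         temperature=1.0):
    """Hypothetical reward-gated trajectory alignment loss for one sample.

    student_traj / teacher_traj: lists of latent vectors (lists of floats),
    one vector per aligned denoising stage.
    student_reward / teacher_reward: scalar scores from some reward model.
    The sigmoid gate is an assumed design, not the paper's definition.
    """
    if len(student_traj) != len(teacher_traj):
        raise ValueError("trajectories must cover the same stages")

    # Gate in (0, 1): close to 1 when the teacher sample is preferred,
    # close to 0 when the student already scores higher, so the student
    # is not pulled back toward a worse teacher trajectory.
    gate = _sigmoid((teacher_reward - student_reward) / temperature)

    # Mean squared distance per stage, then averaged over stages.
    stage_losses = [
        sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
        for s, t in zip(student_traj, teacher_traj)
    ]
    alignment = sum(stage_losses) / len(stage_losses)
    return gate * alignment, gate


if __name__ == "__main__":
    student = [[0.0, 0.0], [0.5, 0.5]]
    teacher = [[1.0, 1.0], [0.5, 0.5]]
    loss, gate = gated_alignment_loss(student, teacher, 0.2, 0.9)
    print(f"gate={gate:.3f} loss={loss:.3f}")
```

Under this assumed design, the alignment term fades when the student's reward exceeds the teacher's, which is one simple way a student could avoid being capped by teacher imitation.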
