Diffusion-APO: Trajectory-Aware Direct Preference Alignment for Video Diffusion Transformers

Jingyuan Zhu; Biaolong Chen; Le Zhang; Aixi Zhang; Hao Jiang; Pipei Huang

arXiv:2605.07503 · cs.CV · May 11, 2026

TL;DR

Diffusion-APO introduces a trajectory-aware preference alignment method for video diffusion models, improving visual quality and instruction following without relying on scalar rewards.

Contribution

The paper presents a trajectory-aware preference optimization algorithm and a modular RLHF framework for scalable preference alignment of video diffusion transformers.

Findings

01. Outperforms standard baselines in visual quality and instruction following.

02. Effectively preserves generative fidelity during model acceleration.

03. Provides a scalable, end-to-end pathway for video diffusion alignment.

Abstract

Efficiently aligning large-scale video diffusion models with human intent requires a scalable and trajectory-aware pathway that bridges the inherent discrepancy between training noise distributions and practical inference trajectories. While existing paradigms such as Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO) attempt to address this, they are often hindered by either reliance on bias-prone, complex reward models or suboptimal timestep sampling. In this paper, we propose Diffusion-APO (Aligned Preference Optimization), a trajectory-aware algorithm that resolves this misalignment by synchronizing training noise with inference-time denoising paths to maximize gradient signal efficacy. To translate this algorithmic innovation into a practical solution, we introduce a unified and modular RLHF framework that integrates online ranking, half-online…
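The core idea in the abstract, drawing training noise levels from the timesteps an inference sampler actually visits and optimizing a reward-free preference objective, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Diffusion-DPO-style loss on per-sample denoising errors and an evenly spaced inference schedule, and the function names, the `beta` parameter, and the schedule itself are hypothetical choices made for illustration.

```python
import numpy as np


def inference_timesteps(num_train_steps: int, num_inference_steps: int) -> np.ndarray:
    """Timesteps an evenly spaced inference sampler would visit (assumed schedule)."""
    return np.linspace(num_train_steps - 1, 0, num_inference_steps).round().astype(int)


def sample_aligned_timesteps(rng: np.random.Generator,
                             schedule: np.ndarray,
                             batch_size: int) -> np.ndarray:
    """Draw training timesteps only from the inference schedule,
    instead of uniformly over all training timesteps."""
    return rng.choice(schedule, size=batch_size)


def dpo_preference_loss(err_w: np.ndarray, err_l: np.ndarray,
                        ref_err_w: np.ndarray, ref_err_l: np.ndarray,
                        beta: float = 1.0) -> float:
    """Diffusion-DPO-style loss from per-sample squared denoising errors.

    err_w / err_l: errors of the trained model on preferred / rejected samples.
    ref_err_w / ref_err_l: errors of the frozen reference model on the same samples.
    """
    # Negative margin means the trained model improved more on the preferred sample.
    margin = (err_w - ref_err_w) - (err_l - ref_err_l)
    # -log(sigmoid(-beta * margin)) written in a numerically stable form.
    return float(np.mean(np.logaddexp(0.0, beta * margin)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    schedule = inference_timesteps(num_train_steps=1000, num_inference_steps=50)
    timesteps = sample_aligned_timesteps(rng, schedule, batch_size=4)
    # Placeholder errors; in practice these come from model forward passes at `timesteps`.
    err_w, err_l = rng.random(4), rng.random(4)
    ref_err_w, ref_err_l = rng.random(4), rng.random(4)
    print(timesteps, dpo_preference_loss(err_w, err_l, ref_err_w, ref_err_l))
```

In this sketch the only change relative to uniform timestep sampling is the call to `sample_aligned_timesteps`; the preference loss needs only pairwise labels (preferred versus rejected), not a scalar reward.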
