McSc: Motion-Corrective Preference Alignment for Video Generation with Self-Critic Hierarchical Reasoning

Qiushi Yang; Yingjie Chen; Yuan Yao; Yifang Men; Huaizhuo Liu; Miaomiao Cui

arXiv:2511.22974·cs.CV·December 1, 2025

Open Access

TL;DR

This paper introduces McSc, a reinforcement learning framework that improves text-to-video generation by better modeling human preferences, especially in motion dynamics, through hierarchical reasoning and dynamic bias mitigation.

Contribution

The paper proposes a novel three-stage reinforcement learning approach with hierarchical reasoning and motion correction to enhance preference alignment in video generation.

Findings

01. McSc outperforms existing methods in human preference alignment.
02. Generated videos exhibit higher motion dynamics and visual quality.
03. The framework effectively mitigates bias towards low-motion content.

Abstract

Text-to-video (T2V) generation has achieved remarkable progress in producing high-quality videos aligned with textual prompts. However, aligning synthesized videos with nuanced human preferences remains challenging due to the subjective and multifaceted nature of human judgment. Existing video preference alignment methods rely on costly human annotations or use proxy metrics to predict preference, which lack an understanding of human preference logic. Moreover, they usually align T2V models directly with the overall preference distribution, ignoring potentially conflicting dimensions such as motion dynamics and visual quality, which may bias models towards low-motion content. To address these issues, we present Motion-corrective alignment with Self-critic hierarchical Reasoning (McSc), a three-stage reinforcement learning framework for robust preference modeling and alignment. Firstly,…

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization