Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics

Tianshuo Xu; Zhifei Chen; Leyi Wu; Hao Lu; Ying-cong Chen

arXiv:2603.10408 · cs.CV · March 12, 2026

TL;DR

This paper introduces Motion Forcing, a hierarchical framework that decouples physical reasoning from visual synthesis to improve robustness and physical consistency in complex video generation tasks.

Contribution

The paper proposes a Point-Shape-Appearance paradigm and a masked point recovery strategy to improve physical understanding and stability in video generation.
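The summary above names a masked point recovery strategy without giving its details. As a rough illustration of the general idea only, the following sketch hides a random subset of sparse point positions and scores a prediction on the hidden entries alone. The function names, the `(T, N, 2)` track layout, the mask ratio, and the choice to keep the first frame visible are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def mask_point_tracks(tracks, mask_ratio=0.5, seed=0):
    """Hide a random subset of point positions (hypothetical helper).

    tracks: array of shape (T, N, 2), N points tracked over T frames.
    Returns (masked_tracks, visible), where visible is a (T, N) bool array.
    """
    rng = np.random.default_rng(seed)
    num_frames, num_points, _ = tracks.shape
    visible = rng.random((num_frames, num_points)) >= mask_ratio
    visible[0] = True  # assumption: the first frame is always given as context
    masked = np.where(visible[..., None], tracks, 0.0)  # zero out hidden points
    return masked, visible

def recovery_loss(pred, target, visible):
    """Mean squared error computed only over the hidden (masked) points."""
    hidden = ~visible
    if not hidden.any():
        return 0.0
    return float(np.mean((pred[hidden] - target[hidden]) ** 2))

if __name__ == "__main__":
    tracks = np.random.default_rng(1).random((8, 5, 2))
    masked, visible = mask_point_tracks(tracks, mask_ratio=0.5)
    # A perfect prediction recovers the hidden points exactly.
    print(recovery_loss(tracks, tracks, visible))
```

A model trained this way would receive `masked` and `visible` as input and be penalized with `recovery_loss` on the positions it could not see, which is the usual shape of a masked-reconstruction objective.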

Findings

1. Outperforms state-of-the-art methods on autonomous driving benchmarks.

2. Maintains physical consistency in complex scenes with collisions or dense traffic.

3. Demonstrates generality across physics and robotics applications.

Abstract

The ultimate goal of video generation is to satisfy a fundamental trilemma: achieving high visual quality, maintaining rigorous physical consistency, and enabling precise controllability. While recent models can maintain this balance in simple, isolated scenarios, we observe that this equilibrium is fragile and often breaks down as scene complexity increases (e.g., involving collisions or dense traffic). To address this, we introduce Motion Forcing, a framework designed to stabilize this trilemma even in complex generative tasks. Our key insight is to explicitly decouple physical reasoning from visual synthesis via a hierarchical "Point-Shape-Appearance" paradigm. This approach decomposes generation into verifiable stages: modeling complex dynamics as sparse geometric anchors (Point), expanding them into dynamic depth maps that explicitly resolve 3D geometry…
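To make the Point-to-Shape step more concrete, here is a toy sketch of how sparse anchors could be expanded into a depth map in which nearer geometry occludes farther geometry. It is a hand-written z-buffer splat, not the paper's learned stage; the function name `points_to_depth`, the disk footprint, and the `(x, y, z)` point layout are assumptions chosen for illustration.

```python
import numpy as np

def points_to_depth(points, height, width, radius=2, background=np.inf):
    """Splat sparse anchors into a depth map with a z-buffer (toy example).

    points: iterable of (x, y, z), with pixel coordinates x, y and depth z.
    Each point is drawn as a disk; where disks overlap, the nearest depth wins.
    """
    depth = np.full((height, width), background, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    for x, y, z in points:
        disk = (xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2
        # Keep the smaller (closer) depth so occlusion is resolved explicitly.
        depth[disk] = np.minimum(depth[disk], z)
    return depth

if __name__ == "__main__":
    # Two overlapping anchors: the second is closer to the camera.
    depth = points_to_depth([(5, 5, 10.0), (6, 5, 4.0)], height=12, width=12)
    print(depth[5, 5])  # pixel covered by both disks
    print(depth[5, 3])  # pixel covered only by the farther anchor
```

Because the intermediate depth map is an ordinary array, it can be inspected frame by frame before any appearance is rendered, which is the sense in which a staged pipeline of this kind is verifiable.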

Code & Models

Models

Hugging Face: TSXu/MotionForcing_driving (model)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Human Motion and Animation