Motion-o: Trajectory-Grounded Video Reasoning

Bishoy Galoaa, Shayda Moezzi, Xiangyu Bai, and Sarah Ostadabbas

arXiv:2603.18856 · cs.CV · May 11, 2026

TL;DR

Motion-o introduces an explicit, verifiable motion component to video reasoning models, enhancing their ability to reason about object trajectories and dynamic evidence in videos.

Contribution

It formalizes Spatial-Temporal-Trajectory reasoning and extends vision-language models with a structured motion pathway called MCoT, improving trajectory-faithful reasoning.

Findings

01 Motion-o improves trajectory-based reasoning across multiple benchmarks.

02 It enhances interpretability by making object motion explicit and verifiable.

03 The approach does not require architectural changes to existing models.

Abstract

Recent video reasoning models increasingly produce spatio-temporal evidence chains that localize objects at specific timestamps. While these traces improve interpretability by grounding where and when evidence appears, they often leave the motion connecting observations, the how, implicit. This makes dynamic and trajectory-dependent claims difficult to supervise, verify, or penalize when unsupported by the video. We formalize this missing component as Spatial-Temporal-Trajectory (STT) reasoning and introduce Motion-o, a motion-centric extension to vision-language models (VLMs) that makes trajectories explicit and verifiable. Motion-o augments evidence chains with Motion Chain of Thought (MCoT), a structured pathway that represents object motion through a discrete <motion/> tag summarizing direction, speed, and scale change. To supervise MCoT, we…
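To make the idea of a discrete motion summary concrete, the following is a minimal sketch of how two localized observations of one object might be reduced to a tag covering direction, speed, and scale change. It is not the paper's implementation: the function name `summarize_motion`, the attribute names, the bin labels, and every threshold are hypothetical choices made for illustration, and the actual `<motion/>` format used by Motion-o may differ.

```python
import math
from typing import List, Tuple

# One observation: (timestamp_seconds, x1, y1, x2, y2), box coordinates
# normalized to [0, 1] with the image origin at the top-left corner.
Observation = Tuple[float, float, float, float, float]


def summarize_motion(
    track: List[Observation],
    still_thresh: float = 0.02,  # hypothetical: below this speed, "stationary"
    fast_thresh: float = 0.20,   # hypothetical: at or above this speed, "fast"
    scale_tol: float = 0.10,     # hypothetical: area change tolerated as "stable"
) -> str:
    """Reduce a track to a discrete tag using its first and last observations."""
    if len(track) < 2:
        raise ValueError("need at least two observations")
    t0, ax1, ay1, ax2, ay2 = track[0]
    t1, bx1, by1, bx2, by2 = track[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be strictly increasing")

    # Displacement of the box center between the two observations.
    dx = (bx1 + bx2) / 2 - (ax1 + ax2) / 2
    dy = (by1 + by2) / 2 - (ay1 + ay2) / 2
    speed_value = math.hypot(dx, dy) / (t1 - t0)  # normalized units per second

    if speed_value < still_thresh:
        speed, direction = "stationary", "none"
    else:
        speed = "fast" if speed_value >= fast_thresh else "slow"
        # Pick the dominant axis; y grows downward in image coordinates.
        if abs(dx) >= abs(dy):
            direction = "right" if dx > 0 else "left"
        else:
            direction = "down" if dy > 0 else "up"

    # Scale change as the ratio of box areas (a rough proxy for depth motion).
    area0 = (ax2 - ax1) * (ay2 - ay1)
    area1 = (bx2 - bx1) * (by2 - by1)
    if area0 <= 0:
        raise ValueError("first box must have positive area")
    ratio = area1 / area0
    if ratio > 1 + scale_tol:
        scale = "growing"
    elif ratio < 1 - scale_tol:
        scale = "shrinking"
    else:
        scale = "stable"

    # Hypothetical attribute names; the paper's tag layout is not shown here.
    return f'<motion direction="{direction}" speed="{speed}" scale="{scale}"/>'


example = [(0.0, 0.10, 0.40, 0.20, 0.60), (2.0, 0.70, 0.40, 0.80, 0.60)]
print(summarize_motion(example))
# <motion direction="right" speed="fast" scale="stable"/>
```

Because each field is drawn from a small fixed vocabulary, a tag like this can be recomputed from the boxes and compared with what a model emitted, which is one way to read the abstract's point about making trajectory claims verifiable.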

Code & Models

Repositories

ostadabbas/Motion-o
github

Models

bishoygaloaa/motion-o
huggingface
