Understanding Dynamic Scenes in Ego Centric 4D Point Clouds
Junsheng Huang, Shengyu Hao, Bocheng Hu, Hongwei Wang, Gaoang Wang

TL;DR
This paper introduces EgoDynamic4D, a comprehensive benchmark with 927K QA pairs and a reasoning framework for understanding complex, dynamic 4D scenes from an egocentric perspective, addressing a key gap in spatio-temporal scene understanding.
Contribution
The paper presents a new large-scale egocentric 4D scene dataset with detailed annotations and a novel reasoning framework that unifies dynamic and static scene information for improved understanding.
Findings
The proposed framework outperforms baselines on multiple tasks.
The framework effectively models multimodal temporal information.
EgoDynamic4D enables detailed spatio-temporal reasoning in dynamic scenes.
Abstract
Understanding dynamic 4D scenes from an egocentric perspective, that is, modeling changes in 3D spatial structure over time, is crucial for human-machine interaction, autonomous navigation, and embodied intelligence. While existing egocentric datasets contain dynamic scenes, they lack unified 4D annotations and task-driven evaluation protocols for fine-grained spatio-temporal reasoning, especially on the motion of objects and humans, together with their interactions. To address this gap, we introduce EgoDynamic4D, a novel QA benchmark on highly dynamic scenes, comprising RGB-D video, camera poses, globally unique instance masks, and 4D bounding boxes. We construct 927K QA pairs accompanied by explicit Chain-of-Thought (CoT), enabling verifiable, step-by-step spatio-temporal reasoning. We design 12 dynamic QA tasks covering agent motion, human-object interaction, trajectory prediction, relation…
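As a rough illustration of how RGB-D frames and camera poses can be combined into a time-stamped ("4D") point cloud, the sketch below applies standard pinhole back-projection to one depth frame. It is not the benchmark's pipeline: the function name `backproject_frame`, the intrinsics `fx, fy, cx, cy`, the row-major camera-to-world pose convention, and the depth units are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point4D = Tuple[float, float, float, float]  # (x, y, z, t) in the world frame


def backproject_frame(
    depth: Sequence[Sequence[float]],         # depth[v][u]; values <= 0 mean "no reading" (assumed)
    fx: float, fy: float, cx: float, cy: float,  # pinhole intrinsics (assumed known)
    cam_to_world: Sequence[Sequence[float]],  # 4x4 row-major pose (assumed convention)
    t: float,                                 # timestamp of this frame
) -> List[Point4D]:
    """Lift every valid depth pixel to a world-frame point tagged with time t."""
    points: List[Point4D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a depth measurement
            # Pinhole back-projection into the camera frame.
            p_cam = ((u - cx) * z / fx, (v - cy) * z / fy, z, 1.0)
            # Rigid transform into the world frame using the camera pose.
            xw, yw, zw = (
                sum(cam_to_world[r][k] * p_cam[k] for k in range(4))
                for r in range(3)
            )
            points.append((xw, yw, zw, t))
    return points


# Hypothetical 2x2 depth frame; the camera is shifted by +1 along world x.
pose = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
cloud = backproject_frame([[2.0, 0.0], [0.0, 2.0]], 1.0, 1.0, 0.5, 0.5, pose, t=0.5)
print(cloud)
```

Concatenating the output over all frames yields points in one shared world frame, each carrying its capture time, which is one common way to represent a dynamic scene as a 4D point cloud.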
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Motion and Animation · Autonomous Vehicle Technology and Safety
