Ego3DT: Tracking Every 3D Object in Ego-centric Videos
Shengyu Hao, Wenhao Chai, Zhonghan Zhao, Meiqi Sun, Wendi Hu, Jieyang Zhou, Yixian Zhao, Qi Li, Yizhou Wang, Xi Li, Gaoang Wang

TL;DR
Ego3DT introduces a zero-shot framework for accurate 3D object reconstruction and tracking in ego-centric videos, overcoming viewing angle variability with dynamic scene modeling and hierarchical association.
Contribution
The paper presents a novel zero-shot 3D reconstruction and tracking method specifically designed for ego-centric videos, incorporating dynamic scene modeling and hierarchical association mechanisms.
Findings
Achieved 1.04x to 2.90x improvements in HOTA scores.
Demonstrated robustness across diverse ego-centric scenarios.
Validated on two newly compiled datasets.
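For readers unfamiliar with the metric, HOTA (Higher Order Tracking Accuracy) at a single localization threshold is the geometric mean of a detection accuracy term (DetA) and an association accuracy term (AssA); the reported score averages this over a range of thresholds. The sketch below shows how such a score and a relative improvement factor could be computed. The function names and all numbers are illustrative assumptions, not values or code from the paper.

```python
import math

def hota_at_threshold(det_a: float, ass_a: float) -> float:
    """HOTA at one localization threshold: sqrt(DetA * AssA)."""
    return math.sqrt(det_a * ass_a)

def mean_hota(per_threshold: list[tuple[float, float]]) -> float:
    """Average HOTA over a list of (DetA, AssA) pairs, one per threshold."""
    scores = [hota_at_threshold(d, a) for d, a in per_threshold]
    return sum(scores) / len(scores)

def improvement_factor(new_score: float, baseline_score: float) -> float:
    """Relative improvement expressed as a multiplier (e.g. 2.0 means 2x)."""
    return new_score / baseline_score

# Hypothetical scores, chosen only to illustrate the arithmetic.
baseline = mean_hota([(0.40, 0.10), (0.30, 0.10)])
proposed = mean_hota([(0.40, 0.40), (0.30, 0.30)])
factor = improvement_factor(proposed, baseline)
```

Because HOTA is a geometric mean, a tracker that detects well but associates poorly is penalized on the combined score, which is why gains in association can move the factor substantially.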
Abstract
The growing interest in embodied intelligence has brought ego-centric perspectives to contemporary research. One significant challenge within this realm is the accurate localization and tracking of objects in ego-centric videos, primarily due to the substantial variability in viewing angles. To address this issue, this paper introduces a novel zero-shot approach for the 3D reconstruction and tracking of all objects in an ego-centric video. We present Ego3DT, a novel framework that first identifies objects within the ego environment and extracts their detection and segmentation information. Using information from adjacent video frames, Ego3DT dynamically constructs a 3D scene of the ego view with a pre-trained 3D scene reconstruction model. Additionally, we introduce a dynamic hierarchical association mechanism for creating stable 3D tracking trajectories of objects in…
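To make the association step more concrete, here is a minimal sketch of matching objects between two frames once each object has a 3D position in a shared scene. It uses a plain greedy nearest-neighbor match with a distance gate. This is a simple stand-in and not the paper's dynamic hierarchical association mechanism; the function name `associate`, the `max_dist` parameter, and the sample coordinates are all assumptions made for illustration.

```python
import math

def associate(tracks, detections, max_dist=0.5):
    """Greedily match new 3D detections to existing tracks.

    tracks: dict mapping track_id -> (x, y, z) last known centroid.
    detections: list of (x, y, z) centroids from the current frame.
    Returns (matches, unmatched), where matches maps a detection index
    to a track_id and unmatched lists detection indices with no track.
    """
    # Every candidate pairing, sorted by Euclidean distance, closest first.
    pairs = sorted(
        (math.dist(pos, det), tid, j)
        for tid, pos in tracks.items()
        for j, det in enumerate(detections)
    )
    used_tracks, matches = set(), {}
    for dist, tid, j in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if tid in used_tracks or j in matches:
            continue  # each track and each detection is matched at most once
        matches[j] = tid
        used_tracks.add(tid)
    unmatched = [j for j in range(len(detections)) if j not in matches]
    return matches, unmatched

# Hypothetical centroids in meters, in a shared scene coordinate frame.
tracks = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
detections = [(1.05, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
matches, unmatched = associate(tracks, detections)
```

Matching in 3D scene coordinates rather than in image space is what makes this kind of association less sensitive to camera motion: an object keeps roughly the same 3D position even when its 2D appearance changes sharply as the viewpoint moves. Unmatched detections would typically start new tracks.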
Taxonomy
Topics: Human Motion and Animation
