Ego3DT: Tracking Every 3D Object in Ego-centric Videos

Shengyu Hao; Wenhao Chai; Zhonghan Zhao; Meiqi Sun; Wendi Hu; Jieyang Zhou; Yixian Zhao; Qi Li; Yizhou Wang; Xi Li; Gaoang Wang

arXiv:2410.08530 · cs.CV · October 14, 2024

TL;DR

Ego3DT introduces a zero-shot framework for accurate 3D object reconstruction and tracking in ego-centric videos, overcoming viewing angle variability with dynamic scene modeling and hierarchical association.

Contribution

The paper presents a novel zero-shot 3D reconstruction and tracking method specifically designed for ego-centric videos, incorporating dynamic scene modeling and hierarchical association mechanisms.

Findings

01. Achieved 1.04x to 2.90x improvements in HOTA scores.

02. Demonstrated robustness across diverse ego-centric scenarios.

03. Validated on two newly compiled datasets.
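
For context on the first finding: HOTA (Higher Order Tracking Accuracy) at a single localization threshold is the geometric mean of a detection accuracy term (DetA) and an association accuracy term (AssA), and the reported metric averages this value over a range of thresholds. The sketch below shows that relationship and how a "times improvement" ratio would be read. The function names and every number in it are hypothetical illustrations, not values or code from the paper.

```python
import math


def hota_at_threshold(tp: int, fn: int, fp: int, assoc_scores: list[float]) -> float:
    """HOTA at one localization threshold: sqrt(DetA * AssA).

    assoc_scores holds one association score in [0, 1] per true positive
    (hypothetical inputs; a real evaluator derives them from track overlap).
    """
    if tp == 0:
        return 0.0
    det_a = tp / (tp + fn + fp)       # detection accuracy
    ass_a = sum(assoc_scores) / tp    # mean association accuracy over TPs
    return math.sqrt(det_a * ass_a)


def improvement_ratio(new_score: float, baseline_score: float) -> float:
    """Ratio form used in claims such as '2.90x improvement'."""
    return new_score / baseline_score


# Illustrative numbers only, not results from the paper.
score = hota_at_threshold(tp=8, fn=1, fp=1, assoc_scores=[0.5] * 8)
ratio = improvement_ratio(0.58, 0.20)
```

Because HOTA multiplies the two terms, a tracker that detects objects well but frequently swaps identities still scores low, which is why the metric suits identity-preserving tracking tasks.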

Abstract

The growing interest in embodied intelligence has brought ego-centric perspectives to contemporary research. One significant challenge within this realm is the accurate localization and tracking of objects in ego-centric videos, primarily due to the substantial variability in viewing angles. To address this issue, this paper introduces a novel zero-shot approach for the 3D reconstruction and tracking of all objects in an ego-centric video. We present Ego3DT, a novel framework that first identifies and extracts detection and segmentation information for objects within the ego environment. Using information from adjacent video frames, Ego3DT dynamically constructs a 3D scene of the ego view with a pre-trained 3D scene reconstruction model. Additionally, we introduce a dynamic hierarchical association mechanism for creating stable 3D tracking trajectories of objects in…
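
To make the idea of associating objects across frames in 3D concrete, here is a minimal sketch that matches new detections to existing tracks by the distance between their 3D centroids. It uses plain greedy nearest-neighbour matching, which is a simplified stand-in and not the paper's dynamic hierarchical association mechanism. The names `associate`, `tracks`, `detections`, and `max_dist`, and the 0.5 threshold, are assumptions made for illustration.

```python
import math

Point = tuple[float, float, float]


def associate(
    tracks: dict[int, Point],
    detections: list[Point],
    max_dist: float = 0.5,  # hypothetical gating distance in scene units
) -> dict[int, int]:
    """Greedily match detections to tracks by 3D centroid distance.

    Returns {detection index: track id}. Detections farther than
    max_dist from every free track stay unmatched.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(pos, det), tid, j)
        for tid, pos in tracks.items()
        for j, det in enumerate(detections)
    )
    matches: dict[int, int] = {}
    used_tracks: set[int] = set()
    for dist, tid, j in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if tid in used_tracks or j in matches:
            continue  # each track and detection is used at most once
        matches[j] = tid
        used_tracks.add(tid)
    return matches


# Illustrative centroids only.
tracks = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
detections = [(1.1, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
matches = associate(tracks, detections)
```

In this example the third detection is left unmatched, so a tracker built on this sketch would start a new track for it. Matching in 3D rather than in image space is what lets an object keep its identity when the camera viewpoint changes sharply.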

Taxonomy

Topics: Human Motion and Animation