Ego-1K -- A Large-Scale Multiview Video Dataset for Egocentric Vision

Jae Yong Lee, Daniel Scharstein, Akash Bapat, Hao Hu, Andrew Fu, Haoru Zhao, Paul Sammut, Xiang Li, Stephen Jeapes, Anik Gupta, Lior David, Saketh Madhuvarasu, Jay Girish Joshi, and Jason Wither

arXiv:2603.13741 · cs.CV · March 17, 2026

TL;DR

Ego-1K is a large-scale multiview egocentric video dataset designed to facilitate research in neural 3D video synthesis and dynamic scene understanding, especially for smart glasses with multiple cameras.

Contribution

The paper introduces Ego-1K, a novel dataset with synchronized multiview egocentric videos capturing hand interactions, enabling benchmarking and advancing egocentric scene reconstruction methods.

Findings

1. Existing 3D and 4D view synthesis methods face challenges with Ego-1K due to large disparities and motion.
2. The dataset highlights the need for improved algorithms to handle egocentric view complexities.
3. Ego-1K supports future research in egocentric vision and scene reconstruction.

Abstract

We present Ego-1K, a large-scale collection of time-synchronized egocentric multiview videos designed to advance neural 3D video synthesis and dynamic scene understanding. The dataset contains nearly 1,000 short egocentric videos captured with a custom rig with 12 synchronized cameras surrounding a 4-camera VR headset worn by the user. Scene content focuses on hand motions and hand-object interactions in different settings. We describe rig design, data processing, and calibration. Our dataset enables new ways to benchmark egocentric scene reconstruction methods, an important research area as smart glasses with multiple cameras become omnipresent. Our experiments demonstrate that our dataset presents unique challenges for existing 3D and 4D novel view synthesis methods due to large disparities and image motion caused by close dynamic objects and rig egomotion. Our dataset supports future…
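The abstract attributes part of the difficulty to large disparities caused by close objects. A minimal sketch of why that happens, assuming an idealized rectified pinhole stereo pair where disparity is d = f * B / Z: the focal length and baseline below are hypothetical illustration values, not parameters of the Ego-1K rig, and `disparity_px` is a helper defined here, not part of any dataset tooling.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Disparity in pixels for a rectified pinhole stereo pair: d = f * B / Z."""
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    return focal_px * baseline_m / depth_m


# Hypothetical values chosen only for illustration (NOT from the paper).
FOCAL_PX = 500.0    # focal length in pixels
BASELINE_M = 0.10   # distance between the two camera centers, in meters

# Hands and held objects sit close to a head-worn rig; walls and furniture do not.
for depth_m in (0.3, 1.0, 3.0):
    d = disparity_px(FOCAL_PX, BASELINE_M, depth_m)
    print(f"depth {depth_m:.1f} m -> disparity {d:.1f} px")
```

Because disparity scales with 1/Z, halving the distance to an object doubles its pixel shift between views, which is why hand-object interactions near the cameras stress view synthesis methods more than distant background does.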

Code & Models

Datasets

facebook/ego-1k (dataset, 9.6k downloads)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Human Pose and Action Recognition