MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware
Senthil Palanisamy, Abhishek Anand, Satpal Singh Rathor, Pratyush Patnaik, Shubhanshu Khatana, Ekaksh Janweja

TL;DR
MobileEgo Anywhere introduces a framework built on commodity mobile hardware, along with a dataset, for long-duration egocentric data collection, supporting work on Vision Language Action models and robotic policy development.
Contribution
It provides a large-scale egocentric dataset, an open-source processing infrastructure, and a pipeline for transforming mobile captures into training data.
Findings
Released 200 hours of egocentric data with persistent state tracking.
Open-sourced the STERA video processing infrastructure.
Enabled scalable, long-horizon data collection on commodity hardware.
Abstract
The recent advancement of Vision Language Action (VLA) models has driven a critical demand for large-scale egocentric datasets. However, existing datasets are often limited by short episode durations, typically spanning only a few minutes, and so fail to capture the long-horizon temporal dependencies necessary for complex robotic task execution. To bridge this gap, we present MobileEgo Anywhere, a framework designed to facilitate the collection of robust, hour-plus egocentric trajectories using commodity mobile hardware. We leverage the ubiquitous sensor suites of modern smartphones to provide high-fidelity, long-term camera pose tracking, effectively removing the high hardware barriers associated with traditional robotics data collection. Our contributions are threefold: (1) we release a novel dataset comprising 200 hours of diverse, long-form egocentric data with persistent state…
