Motion Focus Recognition in Fast-Moving Egocentric Video
Si-En Hong, James Tribble, Alexander Lake, Hao Wang, Chaoyi Zhou, Ashish Bastola, Siyu Huang, Eisa Chaudhary, Brian Canada, Ismahan Arslan-Ari, and Abolfazl Razi

TL;DR
This paper introduces a real-time motion focus recognition method that estimates the subject's locomotion intention from egocentric video, making motion analysis of fast-moving scenarios practical on edge devices.
Contribution
The work presents a real-time motion focus recognition approach that combines a foundation model for camera pose estimation with system-level optimizations for efficient, scalable inference on egocentric video.
Findings
Achieves real-time performance with manageable memory consumption through a sliding batch inference strategy.
Evaluated on a collected egocentric action dataset.
Makes motion-centric analysis practical for edge deployment.
Abstract
From Vision-Language-Action (VLA) systems to robotics, existing egocentric datasets primarily focus on action recognition tasks, while largely overlooking the inherent role of motion analysis in sports and other fast-movement scenarios. To bridge this gap, we propose a real-time motion focus recognition method that estimates the subject's locomotion intention from any egocentric video. We leverage a foundation model for camera pose estimation and introduce system-level optimizations to enable efficient and scalable inference. Evaluated on a collected egocentric action dataset, our method achieves real-time performance with manageable memory consumption through a sliding batch inference strategy. This work makes motion-centric analysis practical for edge deployment and offers a complementary perspective to existing egocentric studies on sports and fast-movement activities.
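The abstract does not describe how the sliding batch inference strategy is implemented. As a minimal sketch of the general idea only, assuming that frames are processed in fixed-size, overlapping windows so that peak memory depends on the window size rather than on the video length, the window indices could be generated as follows. The function name `sliding_batches` and the parameters `batch_size` and `stride` are hypothetical and do not come from the paper.

```python
from typing import Iterator, Tuple


def sliding_batches(
    num_frames: int, batch_size: int, stride: int
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) frame ranges that cover [0, num_frames).

    Hypothetical helper, not the authors' implementation. Consecutive
    windows overlap by (batch_size - stride) frames, so at most
    batch_size frames need to be held in memory at a time.
    """
    if batch_size <= 0 or stride <= 0 or stride > batch_size:
        raise ValueError("require 0 < stride <= batch_size")
    start = 0
    while start < num_frames:
        end = min(start + batch_size, num_frames)
        yield start, end
        if end == num_frames:
            break  # the last window already reaches the final frame
        start += stride


# Example: a 10-frame clip, windows of 4 frames, advancing 2 frames each step.
windows = list(sliding_batches(num_frames=10, batch_size=4, stride=2))
print(windows)  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

In such a scheme, each window would be passed to the pose estimation model in turn and the overlapping frames could be used to keep per-window estimates consistent; how the paper handles overlap, if at all, is not stated in the abstract.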
