Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation
Hao Wang, Jiayou Qin, Xiwen Chen, Ashish Bastola, John Suchanek, Zihao Gong, Abolfazl Razi

TL;DR
Motor Focus is a fast, lightweight visual framework that accurately predicts human ego-motion in complex scenes, enhancing assistive navigation for visually impaired users without requiring camera calibration.
Contribution
This paper introduces Motor Focus, a novel image-based ego-motion prediction method that filters camera motion without calibration, improving speed and robustness in assistive navigation.
Findings
Runs at over 40 FPS
Achieves a mean absolute error (MAE) of 60 pixels
Shows robustness, with a signal-to-noise ratio (SNR) of 23 dB
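For readers unfamiliar with the two accuracy metrics above, the following is a minimal sketch of how an MAE in pixels and an SNR in decibels are commonly computed. It is not taken from the paper: the function names `mae_pixels` and `snr_db` are hypothetical, the MAE is assumed to be the mean Euclidean distance between predicted and reference image points, and the SNR is assumed to be the standard power ratio 10·log10(P_signal / P_noise). The paper's exact definitions may differ.

```python
import math


def mae_pixels(pred, truth):
    """Mean Euclidean distance (in pixels) between paired 2D points.

    Assumption: one (x, y) prediction and one (x, y) reference per frame.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    total = sum(math.hypot(px - tx, py - ty)
                for (px, py), (tx, ty) in zip(pred, truth))
    return total / len(pred)


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from two equally long 1D sample sequences.

    Assumption: SNR = 10 * log10(mean signal power / mean noise power).
    """
    if len(signal) != len(noise) or not signal:
        raise ValueError("signal and noise must be non-empty and equally long")
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        return math.inf  # no noise at all
    return 10.0 * math.log10(p_signal / p_noise)
```

Under these assumed definitions, a lower MAE means the predicted point lies closer to the reference, and a higher SNR means the prediction is less disturbed by noise such as camera shake.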
Abstract
Assistive visual navigation systems for visually impaired individuals have become increasingly popular thanks to the rise of mobile computing. Most of these devices work by translating visual information into voice commands. In complex scenarios where multiple objects are present, it is imperative to prioritize object detection and provide immediate notifications for key entities in specific directions. This creates the need to identify the observer's motion direction (ego-motion) by processing visual information alone, which is the key contribution of this paper. Specifically, we introduce Motor Focus, a lightweight image-based framework that predicts ego-motion, that is, the movement intentions of humans (and humanoid machines), from their visual feeds, while filtering out camera motion without any camera calibration. To this end, we implement an optical flow-based pixel-wise temporal…
Taxonomy
Topics: Advanced Vision and Imaging · Virtual Reality Applications and Impacts · Human Pose and Action Recognition
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus
