Depth-Aware Rover: A Study of Edge AI and Monocular Vision for Real-World Implementation
Lomash Relia, Jai G Singla, Amitabh, Nitant Dube

TL;DR
This paper compares stereo and monocular vision methods for rover navigation, demonstrating that monocular edge AI offers a robust, cost-effective solution for real-world applications despite lower accuracy.
Contribution
It presents a practical implementation of monocular depth estimation on a rover, transitioning from stereo vision in simulation to edge AI in real-world deployment.
Findings
Stereo vision achieved higher accuracy in simulation.
The monocular approach was more robust and cost-effective in real-world use.
On the Raspberry Pi 4, object detection ran in real time at 10 FPS, while monocular depth estimation ran at 0.1 FPS.
Abstract
This study analyses simulated and real-world implementations of depth-aware rover navigation, highlighting the transition from stereo vision to monocular depth estimation using edge AI. A Unity-based lunar terrain simulator with stereo cameras and OpenCV's StereoSGBM was used to generate disparity maps. A physical rover built on Raspberry Pi 4 employed UniDepthV2 for monocular metric depth estimation and YOLO12n for real-time object detection. While stereo vision yielded higher accuracy in simulation, the monocular approach proved more robust and cost-effective in real-world deployment, achieving 0.1 FPS for depth and 10 FPS for detection.
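As a rough illustration of how a StereoSGBM disparity map relates to metric depth, the sketch below applies the standard pinhole stereo relation Z = f * B / d. It is not the authors' code. The function names, the focal length (800 px) and the baseline (0.125 m) are hypothetical values chosen for the example. It assumes rectified images and the raw 16-bit output of OpenCV's StereoSGBM, which stores disparity as fixed-point values scaled by 16. Non-positive disparities are treated here as invalid matches.

```python
def disparity_to_depth(raw_disparity, focal_px, baseline_m, scale=16.0):
    """Convert one raw fixed-point disparity value to depth in metres.

    raw_disparity: value as stored in a StereoSGBM disparity map (scaled by 16).
    focal_px:      focal length in pixels (hypothetical in this sketch).
    baseline_m:    distance between the two cameras in metres (hypothetical).
    Returns None when the disparity is non-positive (treated as no valid match).
    """
    disparity_px = raw_disparity / scale
    if disparity_px <= 0:
        return None
    # Pinhole stereo relation: depth = focal length * baseline / disparity
    return focal_px * baseline_m / disparity_px


def depth_map(raw_rows, focal_px, baseline_m):
    """Apply disparity_to_depth to every pixel of a row-major disparity map."""
    return [
        [disparity_to_depth(value, focal_px, baseline_m) for value in row]
        for row in raw_rows
    ]


if __name__ == "__main__":
    # Tiny 2x2 map of raw disparities; -16 stands in for an unmatched pixel.
    raw = [[512, 256], [-16, 1024]]
    print(depth_map(raw, focal_px=800.0, baseline_m=0.125))
```

Because depth is inversely proportional to disparity, a one-pixel matching error costs far more accuracy at long range than at short range, which is one reason stereo accuracy depends heavily on baseline and calibration.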
