Deep Learning Aided Vision System for Planetary Rovers

Lomash Relia; Jai G Singla; Amitabh; Nitant Dube

arXiv:2603.26802 · cs.CV · March 31, 2026

TL;DR

This paper introduces a scalable, compute-efficient vision system for planetary rovers that combines real-time perception with offline terrain reconstruction. On Chandrayaan-3 NavCam stereo imagery, it reports a median depth error of 2.26 cm within a 1 to 10 meter range and a balanced precision-recall tradeoff for object detection.

Contribution

It integrates stereo imagery, neural network-based depth estimation, and object detection into a unified system for planetary exploration.

Findings

01

Median depth error of 2.26 cm within 1-10 meters

02

Balanced precision-recall on lunar scenes

03

Effective fusion of real-time and offline terrain data
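To make the first finding concrete, here is a minimal sketch of how a median depth error restricted to a distance range can be computed. The function name, its parameters, and the sample values are hypothetical illustrations, not the authors' evaluation code or data; the sketch assumes predicted and reference distances are paired and given in meters.

```python
from statistics import median


def median_abs_depth_error_cm(predicted_m, reference_m, min_m=1.0, max_m=10.0):
    """Median absolute error in centimeters.

    Only pairs whose reference depth lies in [min_m, max_m] are counted.
    """
    errors_cm = [
        abs(p - r) * 100.0  # meters to centimeters
        for p, r in zip(predicted_m, reference_m)
        if min_m <= r <= max_m
    ]
    if not errors_cm:
        raise ValueError("no reference depths fall inside the evaluation range")
    return median(errors_cm)


# Made-up values for illustration only (not results from the paper).
predicted = [1.52, 3.01, 7.96, 12.40]
reference = [1.50, 3.00, 8.00, 12.00]
result = median_abs_depth_error_cm(predicted, reference)
```

The last pair is excluded because its reference depth of 12 m falls outside the 1 to 10 meter range. The median is less sensitive to a few large outliers than the mean, which may be one reason to report it.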

Abstract

This study presents a vision system for planetary rovers, combining real-time perception with offline terrain reconstruction. The real-time module integrates CLAHE-enhanced stereo imagery, YOLOv11n-based object detection, and a neural network to estimate object distances. The offline module uses the Depth Anything V2 metric monocular depth estimation model to generate depth maps from captured images, which are fused into dense point clouds using Open3D. Real-world distance estimates from the real-time pipeline provide reliable metric context alongside the qualitative reconstructions. Evaluation on Chandrayaan-3 NavCam stereo imagery, benchmarked against a CAHV-based utility, shows that the neural network achieves a median depth error of 2.26 cm within a 1 to 10 meter range. The object detection model maintains a balanced precision-recall tradeoff on grayscale lunar scenes. This…
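The offline step of turning a depth map into a point cloud can be sketched as a back-projection of each pixel into 3D. The abstract does not state which camera model the fusion uses, so this example assumes a simple pinhole model. The intrinsics (fx, fy, cx, cy) and the function name are hypothetical, and the sketch uses plain Python lists instead of Open3D. It illustrates the general technique, not the authors' implementation.

```python
def backproject_depth(depth_m, fx, fy, cx, cy):
    """Back-project a depth map (rows of depths in meters) into 3D points.

    Assumes a pinhole camera: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with missing or non-positive depth are skipped.
    """
    points = []
    for v, row in enumerate(depth_m):
        for u, z in enumerate(row):
            if z is None or z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 depth map with one invalid pixel; all values are illustrative.
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = backproject_depth(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
```

A list of (x, y, z) tuples like this could then be passed to a point cloud library for fusion and visualization.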
