OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery
Raul Castilla-Arquillo, Carlos Perez-del-Pulgar, Levin Gerdes, Alfonso Garcia-Cerezo, Miguel A. Olivares-Mendez

TL;DR
OmniUnet is a transformer-based neural network that integrates RGB, depth, and thermal imagery for accurate terrain segmentation on planetary rovers. It is demonstrated on a custom dataset and reported to be capable of real-time inference.
Contribution
This work introduces OmniUnet, a novel multimodal neural network architecture designed for unstructured terrain segmentation in planetary exploration. It is accompanied by a custom dataset and an evaluation of inference speed on resource-constrained hardware.
Findings
Achieved 80.37% pixel accuracy in terrain segmentation.
Demonstrated an inference time of 673 ms on resource-constrained hardware.
Developed and publicly released a multimodal dataset and software for planetary terrain perception.
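For readers unfamiliar with the metric in the first finding, pixel accuracy is the fraction of labelled pixels whose predicted class matches the ground truth. The sketch below is a minimal illustration of that definition, not the authors' evaluation code: the function name `pixel_accuracy`, the nested-list mask format, and the optional `ignore_label` parameter are assumptions made for this example, and the paper's exact protocol (for instance, whether unlabelled pixels are excluded) is not stated in this summary.

```python
from typing import Optional, Sequence


def pixel_accuracy(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    ignore_label: Optional[int] = None,  # hypothetical: class id to skip, if any
) -> float:
    """Fraction of counted pixels where the predicted class equals the target class."""
    correct = 0
    total = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # Skip pixels marked as unlabelled in the ground truth (assumption).
            if ignore_label is not None and t == ignore_label:
                continue
            total += 1
            correct += int(p == t)
    if total == 0:
        raise ValueError("no labelled pixels to evaluate")
    return correct / total


# Toy 2x2 masks with three terrain classes (0, 1, 2): 3 of 4 pixels match.
predicted = [[0, 1], [2, 2]]
ground_truth = [[0, 1], [2, 0]]
accuracy = pixel_accuracy(predicted, ground_truth)
```

Pixel accuracy weights every pixel equally, so classes that cover large image areas dominate the score. Segmentation results are therefore often reported alongside per-class metrics.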
Abstract
Robot navigation in unstructured environments requires multimodal perception systems that can support safe navigation. Multimodality enables the integration of complementary information collected by different sensors. However, this information must be processed by machine learning algorithms specifically designed to leverage heterogeneous data. Furthermore, it is necessary to identify which sensor modalities are most informative for navigation in the target environment. In Martian exploration, thermal imagery has proven valuable for assessing terrain safety due to differences in thermal behaviour between soil types. This work presents OmniUnet, a transformer-based neural network architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T) imagery. A custom multimodal sensor housing was developed using 3D printing and mounted on the Martian Rover Testbed for Autonomy…
