Few-shot Semantic Learning for Robust Multi-Biome 3D Semantic Mapping in Off-Road Environments
Deegan Atha, Xianmei Lei, Shehryar Khattak, Anna Sabel, Elle Miller, Aurelio Noca, Grace Lim, Jeffrey Edlund, Curtis Padgett, Patrick Spieler

TL;DR
This paper introduces a few-shot learning approach using a pre-trained Vision Transformer for robust 3D semantic mapping in off-road environments, enabling effective segmentation with minimal labeled data and handling complex terrain hazards.
Contribution
It presents a novel few-shot semantic segmentation method that leverages a pre-trained ViT and a range-based fusion technique for 3D mapping in challenging off-road conditions.
Findings
Zero-shot out-of-biome segmentation achieves 52.9 mIoU on the Yamaha dataset and 55.5 mIoU on the Rellis dataset.
Few-shot fine-tuning improves mIoU to 66.6-67.2 on the same datasets.
The approach effectively detects off-road hazards like water and overhangs.
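The mIoU figures above are mean intersection-over-union scores, apparently reported on a 0-100 scale. A minimal sketch of how such a score could be computed with sparse labels follows. It is not the authors' evaluation code: the function name `mean_iou`, the `ignore_index` value of 255 for unlabeled pixels, and the choice to average only over classes that appear in the prediction or the ground truth are all assumptions made for illustration.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU in [0, 1] over classes that occur in pred or target.

    Pixels whose ground-truth label equals ignore_index (an assumed marker
    for unlabeled pixels) are excluded, which matters for sparse labels.
    Predictions are assumed to lie in [0, num_classes).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Keep only pixels that carry a ground-truth label.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground truth, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that actually occur.
    present = union > 0
    if not present.any():
        return float("nan")
    return float((intersection[present] / union[present]).mean())


# Four pixels, the last one unlabeled; class 2 never occurs.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=3)
print(round(100 * score, 1))  # 50.0
```

Multiplying the result by 100 gives a value on the same scale as the scores listed above.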
Abstract
Off-road environments pose significant perception challenges for high-speed autonomous navigation due to unstructured terrain, degraded sensing conditions, and domain-shifts among biomes. Learning semantic information across these conditions and biomes can be challenging when a large amount of ground truth data is required. In this work, we propose an approach that leverages a pre-trained Vision Transformer (ViT) with fine-tuning on a small (<500 images), sparse and coarsely labeled (<30% pixels) multi-biome dataset to predict 2D semantic segmentation classes. These classes are fused over time via a novel range-based metric and aggregated into a 3D semantic voxel map. We demonstrate zero-shot out-of-biome 2D semantic segmentation on the Yamaha (52.9 mIoU) and Rellis (55.5 mIoU) datasets along with few-shot coarse sparse labeling with existing data for improved segmentation performance…
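The abstract does not define the range-based fusion metric, so the following is only an illustrative sketch of the general idea: per-pixel class probabilities are projected to 3D points and accumulated per voxel, with closer observations trusted more. The class name `RangeWeightedVoxelMap`, the exponential weight `exp(-range / range_scale)`, and the default voxel size are hypothetical choices, not the paper's method.

```python
from collections import defaultdict

import numpy as np


class RangeWeightedVoxelMap:
    """Toy semantic voxel map; the weighting scheme is an assumption."""

    def __init__(self, num_classes, voxel_size=0.5, range_scale=10.0):
        self.num_classes = num_classes
        self.voxel_size = voxel_size    # voxel edge length in metres
        self.range_scale = range_scale  # hypothetical decay length in metres
        # Accumulated, range-weighted class scores per voxel index.
        self.scores = defaultdict(lambda: np.zeros(num_classes))

    def _key(self, point):
        # Integer voxel index containing the point.
        idx = np.floor(np.asarray(point, dtype=float) / self.voxel_size)
        return tuple(int(i) for i in idx)

    def update(self, points, probs, sensor_origin):
        """Fuse one frame: points is (N, 3), probs is (N, num_classes)."""
        points = np.asarray(points, dtype=float)
        probs = np.asarray(probs, dtype=float)
        origin = np.asarray(sensor_origin, dtype=float)

        # Distance from the sensor to each observed point.
        ranges = np.linalg.norm(points - origin, axis=1)
        # Assumed weighting: nearer observations count more.
        weights = np.exp(-ranges / self.range_scale)

        for point, w, p in zip(points, weights, probs):
            self.scores[self._key(point)] += w * p

    def label(self, point):
        """Most likely class of the voxel containing point, or None."""
        key = self._key(point)
        if key not in self.scores:
            return None
        return int(np.argmax(self.scores[key]))


voxel_map = RangeWeightedVoxelMap(num_classes=2)
p = [1.1, 0.1, 0.1]
# Far observation (20 m away) favours class 0.
voxel_map.update([p], [[0.9, 0.1]], sensor_origin=[21.1, 0.1, 0.1])
# Near observation (1 m away) favours class 1 and outweighs the far one.
voxel_map.update([p], [[0.2, 0.8]], sensor_origin=[2.1, 0.1, 0.1])
print(voxel_map.label(p))  # 1
```

Accumulating weighted scores rather than overwriting labels lets a later close-range view correct an earlier long-range guess, which is one plausible motivation for weighting by range.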
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Digital Imaging for Blood Diseases · Human Pose and Action Recognition
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Residual Connection
