Few-shot Semantic Learning for Robust Multi-Biome 3D Semantic Mapping in Off-Road Environments

Deegan Atha; Xianmei Lei; Shehryar Khattak; Anna Sabel; Elle Miller; Aurelio Noca; Grace Lim; Jeffrey Edlund; Curtis Padgett; Patrick Spieler

arXiv:2411.06632 · cs.CV · February 27, 2025

TL;DR

This paper introduces a few-shot learning approach using a pre-trained Vision Transformer for robust 3D semantic mapping in off-road environments, enabling effective segmentation with minimal labeled data and handling complex terrain hazards.

Contribution

It presents a novel few-shot semantic segmentation method that leverages a pre-trained ViT and a range-based fusion technique for 3D mapping in challenging off-road conditions.

Findings

01

Zero-shot, out-of-biome segmentation achieves 52.9 mIoU on the Yamaha dataset and 55.5 mIoU on the Rellis dataset.

02

Few-shot fine-tuning improves mIoU to 66.6-67.2 on the same datasets.

03

The approach effectively detects off-road hazards like water and overhangs.

Abstract

Off-road environments pose significant perception challenges for high-speed autonomous navigation due to unstructured terrain, degraded sensing conditions, and domain shifts among biomes. Learning semantic information across these conditions and biomes can be challenging when a large amount of ground truth data is required. In this work, we propose an approach that leverages a pre-trained Vision Transformer (ViT) with fine-tuning on a small (<500 images), sparse and coarsely labeled (<30% pixels) multi-biome dataset to predict 2D semantic segmentation classes. These classes are fused over time via a novel range-based metric and aggregated into a 3D semantic voxel map. We demonstrate zero-shot out-of-biome 2D semantic segmentation on the Yamaha (52.9 mIoU) and Rellis (55.5 mIoU) datasets along with few-shot coarse sparse labeling with existing data for improved segmentation performance…
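The mIoU figures quoted above are a standard segmentation metric. The following is a minimal sketch of how mean intersection-over-union can be computed when most pixels are unlabeled, as with the sparse labels described in the abstract. It is not the authors' evaluation code: the function name `mean_iou`, the `IGNORE` marker value, and the choice to skip classes that never appear are all assumptions made for illustration.

```python
from typing import Optional, Sequence

IGNORE = -1  # hypothetical marker for pixels with no ground-truth label


def mean_iou(pred: Sequence[int], truth: Sequence[int],
             num_classes: int, ignore: int = IGNORE) -> Optional[float]:
    """Mean IoU over flattened label arrays, skipping unlabeled pixels.

    Assumes every prediction is a class index in [0, num_classes).
    Returns a fraction in [0, 1], or None if no pixel is labeled.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore:
            continue  # sparse labels: unlabeled pixels do not count
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Average only over classes that occur in the labeled pixels.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else None


pred = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, IGNORE, 2, 0]
score = mean_iou(pred, truth, num_classes=3)
```

Scores such as 52.9 or 55.5 correspond to this fraction multiplied by 100. Whether absent classes are skipped or counted as zero changes the result, so published numbers should be compared only under the same convention.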

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Digital Imaging for Blood Diseases · Human Pose and Action Recognition

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Residual Connection