Real-Time Multi-Modal Semantic Fusion on Unmanned Aerial Vehicles with Label Propagation for Cross-Domain Adaptation
Simon Bultmann, Jan Quenzel, Sven Behnke

TL;DR
This paper presents a UAV system that runs semantic inference on LiDAR, RGB, and thermal data onboard in real time, fuses the results into an allocentric semantic map, and uses label propagation on that map for cross-domain sensor adaptation.
Contribution
The work introduces a lightweight, real-time multi-modal semantic fusion system on UAVs with a novel label propagation method for cross-domain sensor adaptation.
Findings
Achieves semantic inference and fusion at approximately 9 Hz onboard the UAV.
Demonstrates effective semantic mapping in urban and disaster environments.
Validates system performance through real-world experiments.
Abstract
Unmanned aerial vehicles (UAVs) equipped with multiple complementary sensors have tremendous potential for fast autonomous or remote-controlled semantic scene analysis, e.g., for disaster examination. Here, we propose a UAV system for real-time semantic inference and fusion of multiple sensor modalities. Semantic segmentation of LiDAR scans and RGB images, as well as object detection on RGB and thermal images, run online onboard the UAV computer using lightweight CNN architectures and embedded inference accelerators. We follow a late fusion approach where semantic information from multiple sensor modalities augments 3D point clouds and image segmentation masks while also generating an allocentric semantic map. Label propagation on the semantic map allows for sensor-specific adaptation with cross-modality and cross-domain supervision. Our system provides augmented semantic images and…
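As an illustration of the late fusion and label propagation idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each modality yields a per-point class probability vector, that two modalities are combined by a normalized element-wise product (a naive independence assumption that the abstract does not state), and that the allocentric map is a simple voxel grid accumulating log-probabilities. The class list, the voxel size, and the names `fuse_probs` and `SemanticVoxelMap` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical label set, chosen only for illustration.
CLASSES = ("building", "vegetation", "person")


def fuse_probs(p_a, p_b, eps=1e-6):
    """Fuse two class distributions by a normalized element-wise product.

    Assumption: the modalities are treated as conditionally independent.
    """
    prod = [max(a, eps) * max(b, eps) for a, b in zip(p_a, p_b)]
    total = sum(prod)
    return [v / total for v in prod]


class SemanticVoxelMap:
    """Toy allocentric map: one log-probability accumulator per voxel."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size  # metres, hypothetical value
        self.log_probs = defaultdict(lambda: [0.0] * len(CLASSES))

    def _key(self, point):
        # Map a 3D point in the world frame to integer voxel indices.
        return tuple(math.floor(c / self.voxel_size) for c in point)

    def integrate(self, point, probs, eps=1e-6):
        # Accumulate evidence in log space to avoid numeric underflow.
        acc = self.log_probs[self._key(point)]
        for i, p in enumerate(probs):
            acc[i] += math.log(max(p, eps))

    def label(self, point):
        # Return the most likely class of the voxel, or None if unobserved.
        key = self._key(point)
        if key not in self.log_probs:
            return None
        acc = self.log_probs[key]
        return CLASSES[max(range(len(CLASSES)), key=acc.__getitem__)]


# Usage: fuse a LiDAR and an RGB prediction for one point, then read the
# map label back as a pseudo-label for a later observation in that voxel.
lidar_probs = [0.5, 0.4, 0.1]
rgb_probs = [0.2, 0.7, 0.1]
fused = fuse_probs(lidar_probs, rgb_probs)

semantic_map = SemanticVoxelMap()
semantic_map.integrate((1.2, 0.3, 2.0), fused)
pseudo_label = semantic_map.label((1.4, 0.1, 2.2))  # same voxel as above
```

In this sketch, the label read back from the map plays the role of a propagated pseudo-label: it could supervise a sensor-specific network whose own prediction for that voxel is less reliable. How the paper selects and weights such labels is not specified in the abstract above.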
