TL;DR
This paper introduces a passive RGB-thermal calibration target and procedure, the PST900 dataset of 894 synchronized, calibrated, and per-pixel annotated RGB-thermal image pairs, and a CNN for fast semantic segmentation that combines RGB and thermal imagery.
Contribution
The paper contributes a novel passive calibration target, the PST900 dataset of synchronized RGB-thermal images with per-pixel annotations, and a CNN architecture that leverages both modalities for semantic segmentation.
Findings
Our calibration method is portable and easy to use.
The PST900 dataset contains 894 synchronized and calibrated RGB-thermal image pairs with per-pixel human annotations across four classes.
Our segmentation network outperforms the state-of-the-art methods it is compared against on PST900.
Abstract
In this work we propose long-wave infrared (LWIR) imagery as a viable supporting modality for semantic segmentation using learning-based techniques. We first address the problem of RGB-thermal camera calibration by proposing a passive calibration target and procedure that is both portable and easy to use. Second, we present PST900, a dataset of 894 synchronized and calibrated RGB and thermal image pairs with per-pixel human annotations across four distinct classes from the DARPA Subterranean Challenge. Lastly, we propose a CNN architecture for fast semantic segmentation that combines both RGB and thermal imagery in a way that leverages RGB imagery independently. We compare our method against state-of-the-art methods and show that it outperforms them on our dataset.
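The abstract does not spell out how the network "leverages RGB imagery independently". One plausible reading, offered here only as an assumption, is a two-stage design: an RGB-only stage produces per-pixel class scores, and a fusion stage then sees the RGB channels, the thermal channel, and those scores together. The minimal sketch below illustrates that data flow in plain Python. The per-pixel linear maps stand in for real CNN stages, and all function names, weights, and the two-class toy setup are hypothetical, not taken from the paper.

```python
from typing import List

Pixel = List[float]
Image = List[List[Pixel]]  # indexed as image[row][col][channel]


def stack_channels(*images: Image) -> Image:
    """Concatenate images of identical height and width along the channel axis."""
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all inputs must share the same height and width")
    return [
        [sum((img[y][x] for img in images), []) for x in range(width)]
        for y in range(height)
    ]


def pixelwise_linear(image: Image, weights: List[List[float]]) -> Image:
    """Apply a per-pixel linear map (a 1x1 convolution without bias).

    `weights` has one row per output channel and one column per input channel.
    This is a stand-in for a real CNN stage, not the paper's network.
    """
    return [
        [[sum(w * c for w, c in zip(w_row, px)) for w_row in weights] for px in row]
        for row in image
    ]


def argmax_labels(scores: Image) -> List[List[int]]:
    """Pick the highest-scoring class index at every pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


def two_stage_segment(rgb, thermal, rgb_weights, fusion_weights):
    """Assumed two-stage flow: RGB-only scores first, then RGB + thermal + scores."""
    rgb_scores = pixelwise_linear(rgb, rgb_weights)         # stage 1 uses RGB alone
    fused_input = stack_channels(rgb, thermal, rgb_scores)  # 3 + 1 + K channels
    fused_scores = pixelwise_linear(fused_input, fusion_weights)
    return argmax_labels(rgb_scores), argmax_labels(fused_scores)
```

Under this assumption the RGB stage can be trained and used on its own, while the fusion stage only needs to learn where thermal evidence should change the RGB prediction. The sketch requires the RGB and thermal images to be pixel-aligned, which is what the calibration step is for.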
