TL;DR
HyperCap is a large-scale hyperspectral captioning dataset that combines spectral data with textual annotations to improve remote sensing model performance and semantic understanding.
Contribution
It introduces the first hyperspectral captioning dataset with integrated spectral and textual data, facilitating advanced vision-language learning in remote sensing.
Findings
Models evaluated with HyperCap show significant improvements in classification performance.
Diverse fusion techniques for combining spectral and textual data prove effective on the dataset.
The results support the potential of vision-language models in hyperspectral imaging.
Abstract
We introduce HyperCap, the first large-scale hyperspectral captioning dataset designed to enhance model performance and effectiveness in remote sensing applications. Unlike traditional hyperspectral imaging (HSI) benchmarks, HyperCap integrates spectral data with pixel-wise textual annotations, enabling deeper semantic understanding. This dataset enhances model performance in tasks like classification and feature extraction, providing a valuable resource for advanced remote sensing applications. HyperCap is constructed from four benchmark datasets and annotated through a hybrid approach combining automated and manual methods to ensure accuracy and consistency. Empirical evaluations using state-of-the-art encoders and diverse fusion techniques demonstrate significant improvements in classification performance. These results underscore the potential of vision-language learning in HSI and…
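The abstract refers to "diverse fusion techniques" for combining spectral and textual representations but does not say which ones were used. As a rough illustration only, the sketch below shows two common generic options, concatenation and a weighted sum, applied to a single pixel's features. The function names, the `alpha` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def fuse_concat(spectral: Vector, text: Vector) -> Vector:
    """Concatenation fusion: join the two feature vectors end to end.

    The fused vector has length len(spectral) + len(text).
    """
    return list(spectral) + list(text)


def fuse_weighted_sum(spectral: Vector, text: Vector, alpha: float = 0.5) -> Vector:
    """Weighted-sum fusion: blend two feature vectors of equal length.

    alpha weights the spectral features; (1 - alpha) weights the text features.
    Both vectors are assumed to already live in a shared embedding space.
    """
    if len(spectral) != len(text):
        raise ValueError("weighted-sum fusion needs vectors of equal length")
    return [alpha * s + (1.0 - alpha) * t for s, t in zip(spectral, text)]


# Toy per-pixel features (hypothetical values, not from HyperCap).
spectral_feat = [2.0, 4.0]
text_feat = [0.0, 0.0]

fused_cat = fuse_concat(spectral_feat, text_feat)
fused_sum = fuse_weighted_sum(spectral_feat, text_feat, alpha=0.5)
```

Concatenation keeps both representations intact at the cost of a larger feature vector, while the weighted sum keeps the dimension fixed but requires both encoders to project into the same space. Which trade-off the authors favor is not stated in the text above.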
