Measuring and Mitigating Hallucinations in Vision-Language Dataset Generation for Remote Sensing
Madeline Anderson, Miriam Cha, William T. Freeman, J. Taylor Perron, Nathaniel Maidel, Kerri Cahoy

TL;DR
This paper introduces a new dataset and methods to generate detailed, context-rich captions for remote sensing images using large language models, while measuring and reducing hallucinations to improve dataset quality and recognition performance.
Contribution
It presents a novel approach integrating maps with LLMs for caption generation, along with techniques to measure and mitigate hallucinations in remote sensing datasets.
Findings
The fMoW-mm dataset effectively combines satellite images, maps, and annotations.
Enhanced caption quality improves target recognition in few-shot learning.
The proposed methods reduce hallucinations in LLM-generated captions.
Abstract
Vision language models have achieved impressive results across various fields. However, adoption in remote sensing remains limited, largely due to the scarcity of paired image-text data. To bridge this gap, synthetic caption generation has gained interest, traditionally relying on rule-based methods that use metadata or bounding boxes. While these approaches provide some description, they often lack the depth needed to capture complex wide-area scenes. Large language models (LLMs) offer a promising alternative for generating more descriptive captions, yet they can produce generic outputs and are prone to hallucination. In this paper, we propose a new method to enhance vision-language datasets for remote sensing by integrating maps as external data sources, enabling the generation of detailed, context-rich captions. Additionally, we present methods to measure and mitigate hallucinations…
Taxonomy
Topics: Hallucinations in medical conditions
