Towards Automatic Satellite Images Captions Generation Using Large Language Models

Yingxu He; Qiqi Sun

arXiv:2310.11392 · cs.CV · October 18, 2023 · 1 cites

Open Access

TL;DR

This paper introduces ARSIC, a novel method that leverages large language models to automatically generate captions for satellite images, addressing dataset scarcity and improving caption quality for remote sensing applications.

Contribution

It proposes a new approach to automatically generate captions for satellite images using LLMs and adapts existing models for high-quality remote sensing image captioning.

Findings

01. Effective automatic caption collection for remote sensing images

02. Improved caption quality over conventional models

03. Demonstrated potential for large-scale dataset creation

Abstract

Automatic image captioning is a promising technique for conveying visual information using natural language. It can benefit various tasks in satellite remote sensing, such as environmental monitoring, resource management, and disaster management. However, one of the main challenges in this domain is the lack of large-scale image-caption datasets, as they require substantial human expertise and effort to create. Recent research on large language models (LLMs) has demonstrated their impressive performance in natural language understanding and generation tasks. Nonetheless, most of them (GPT-3.5, Falcon, Claude, etc.) cannot handle images, while conventional captioning models pre-trained on general ground-view images (BLIP, GIT, CM3, CM3Leon, etc.) often fail to produce detailed and accurate captions for aerial images. To address this problem, we propose a novel approach: Automatic Remote…

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling