SD-RSIC: Summarization Driven Deep Remote Sensing Image Captioning

Gencer Sumbul; Sonali Nayak; Begüm Demir

arXiv:2006.08432 · cs.CV · October 14, 2020

TL;DR

This paper introduces SD-RSIC, a novel remote sensing image captioning method that summarizes training captions to reduce redundancy and adaptively combines them with standard captions, improving captioning performance.

Contribution

The paper proposes a new summarization and adaptive weighting strategy for RS image captioning, addressing redundancy in training captions and enhancing model effectiveness.
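The adaptive combination can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's formulation: it assumes the standard and summarized captions are each represented as a probability distribution over the vocabulary, and that a scalar weight (obtained here from a hypothetical `gate_logit` passed through a sigmoid) blends them as a convex combination.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def combine_word_distributions(p_standard, p_summary, gate_logit):
    """Blend two vocabulary distributions with a weight in (0, 1).

    p_standard, p_summary: 1-D arrays that each sum to 1.
    gate_logit: hypothetical per-image scalar (an assumption of this sketch);
                larger values favor the standard caption distribution.
    """
    w = sigmoid(gate_logit)
    # A convex combination of two distributions is again a distribution.
    return w * p_standard + (1.0 - w) * p_summary

# Toy three-word vocabulary; the numbers are made up for illustration.
p_standard = np.array([0.7, 0.2, 0.1])
p_summary = np.array([0.1, 0.3, 0.6])
mixed = combine_word_distributions(p_standard, p_summary, gate_logit=0.0)
```

With `gate_logit=0.0` the weight is 0.5, so both sources contribute equally; in a learned model the weight would be predicted per image rather than fixed.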

Findings

01. Outperforms state-of-the-art methods on multiple datasets
02. Effectively reduces redundancy in training captions
03. Improves captioning accuracy and semantic relevance

Abstract

Deep neural networks (DNNs) have recently become popular for image captioning problems in remote sensing (RS). Existing DNN-based approaches rely on the availability of a training set made up of a large number of RS images with their captions. However, captions of training images may contain redundant information (they can be repetitive or semantically similar to each other), resulting in information deficiency while learning a mapping from the image domain to the language domain. To overcome this limitation, in this paper, we present a novel Summarization Driven Remote Sensing Image Captioning (SD-RSIC) approach. The proposed approach consists of three main steps. The first step obtains the standard image captions by jointly exploiting convolutional neural networks (CNNs) with long short-term memory (LSTM) networks. The second step, unlike the existing RS image captioning methods,…
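As a rough illustration of the first step, the sketch below runs a single LSTM step on a vector standing in for a CNN image feature. It is a generic textbook LSTM cell in NumPy, not the authors' implementation: the dimensions, the random weights, and the names `lstm_step` and `cnn_feature` are all assumptions made for this example. It also shows where the sigmoid and tanh activations enter.

```python
import numpy as np

def sigmoid(x):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    x: input vector of size D; h_prev, c_prev: previous states of size H.
    W: (4H, D), U: (4H, H), b: (4H,) hold the stacked gate parameters.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell update
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 8, 4                              # toy sizes, chosen arbitrarily
cnn_feature = rng.normal(size=D)         # stand-in for a CNN image feature
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_step(cnn_feature, np.zeros(H), np.zeros(H), W, U, b)
```

In a captioning model the hidden state `h` would typically feed a softmax layer over the vocabulary to predict the next word, and the step would repeat with the embedding of that word as the next input.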

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory