Improving Image Captioning by Concept-based Sentence Reranking

Xirong Li; Qin Jin

arXiv:1605.00855 · cs.CV · May 4, 2016 · 1 cites

Open Access

TL;DR

This paper presents a concept-based sentence reranking method that enhances image captioning models by leveraging concept annotations, leading to improved performance on the ImageCLEF 2015 benchmark.

Contribution

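The paper introduces a reranking approach that treats the underlying captioning model as a black box: predicted sentences are reordered according to how well they match concepts detected in the image. On the ImageCLEF 2015 test set, the resulting system outperforms the runner-up (METEOR of 0.1875 versus 0.1687).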

Findings

01

Achieved a METEOR score of 0.1875 on the ImageCLEF 2015 test set.

02

Outperformed the runner-up (METEOR of 0.1687) by a clear margin.

03

Demonstrated the effectiveness of concept-based reranking in image captioning.

Abstract

This paper describes our winning entry in the ImageCLEF 2015 image sentence generation task. We improve Google's CNN-LSTM model by introducing concept-based sentence reranking, a data-driven approach which exploits the large amounts of concept-level annotations on Flickr. Unlike previous uses of concept detection, which are tailored to specific image captioning models, the proposed approach reranks predicted sentences in terms of their matches with detected concepts, essentially treating the underlying model as a black box. This property makes the approach applicable to a number of existing solutions. We also experiment with fine-tuning the deep language model, which improves the performance further. With a METEOR score of 0.1875 on the ImageCLEF 2015 test set, our system outperforms the runner-up (METEOR of 0.1687) by a clear margin.
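
The reranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the additive combination, the `weight` parameter, the whitespace tokenization, and the function names `concept_match_score` and `rerank` are all assumptions made for illustration. The sketch only shows the black-box property: it needs the candidate sentences and their scores from any captioning model, plus concept detector confidences for the image.

```python
from typing import Dict, List, Tuple


def concept_match_score(sentence: str, concepts: Dict[str, float]) -> float:
    """Sum detector confidences of concepts whose name occurs as a word in the sentence.

    Assumption: exact word match on lowercased, whitespace-split tokens.
    """
    tokens = set(sentence.lower().split())
    return sum(conf for name, conf in concepts.items() if name in tokens)


def rerank(
    candidates: List[Tuple[str, float]],
    concepts: Dict[str, float],
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Reorder (sentence, model_score) pairs by model score plus weighted concept match.

    Assumption: a simple additive combination; the captioning model itself is
    never inspected, only its output sentences and scores are used.
    """
    rescored = [
        (sentence, model_score + weight * concept_match_score(sentence, concepts))
        for sentence, model_score in candidates
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: two candidate captions with model log-scores,
# and concept detector confidences for the same image.
candidates = [
    ("a man riding a horse", -1.0),
    ("a dog running on grass", -1.2),
]
concepts = {"dog": 0.9, "grass": 0.7, "horse": 0.1}

for sentence, score in rerank(candidates, concepts):
    print(f"{score:+.2f}  {sentence}")
```

In this toy input the captioning model prefers the "horse" sentence, but the detected concepts favor "dog" and "grass", so the second candidate moves to the top after reranking. Setting `weight=0.0` recovers the original model ordering.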

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling