ICECAP: Information Concentrated Entity-aware Image Captioning

Anwen Hu; Shizhe Chen; Qin Jin

arXiv:2108.02050·cs.CV·August 5, 2021

TL;DR

ICECAP is a news image captioning model that uses the associated news article at progressively finer granularity, from sentences down to words, to generate informative, entity-aware captions. It is reported to outperform prior methods on the BreakingNews and GoodNews datasets.

Contribution

The paper introduces ICECAP, a progressive concentration approach that refines relevant news information from sentence to word level for improved image captioning.

Findings

01 Outperforms state-of-the-art methods on the BreakingNews and GoodNews datasets

02 Effectively concentrates on relevant textual information at multiple levels

03 Demonstrates significant improvements in caption informativeness and accuracy

Abstract

Most current image captioning systems focus on describing general image content, and lack background knowledge to deeply understand the image, such as exact named entities or concrete events. In this work, we focus on the entity-aware news image captioning task which aims to generate informative captions by leveraging the associated news articles to provide background knowledge about the target image. However, due to the length of news articles, previous works only employ news articles at the coarse article or sentence level, which are not fine-grained enough to refine relevant events and choose named entities accurately. To overcome these limitations, we propose an Information Concentrated Entity-aware news image CAPtioning (ICECAP) model, which progressively concentrates on relevant textual information within the corresponding news article from the sentence level to the word level.…
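The coarse-to-fine idea in the abstract (narrow the article to relevant sentences first, then attend over individual words) can be illustrated with a toy sketch. This is not the authors' model: the cosine-similarity sentence scoring, the mean pooling, the dot-product word attention, the two-dimensional vectors, and every function and token name below are assumptions chosen only to keep the example small and runnable.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def select_sentences(image_vec, sentences, k):
    """Coarse step (assumed scoring): keep the k sentences whose mean word
    vector is most similar to the image vector, in article order."""
    scores = [cosine(image_vec, mean_vector([vec for _, vec in sent]))
              for sent in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def word_attention(query_vec, sentences, kept):
    """Fine step (assumed scoring): softmax over dot products between the
    query and every word in the kept sentences."""
    words, logits = [], []
    for i in kept:
        for token, vec in sentences[i]:
            words.append(token)
            logits.append(dot(query_vec, vec))
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [(w, e / total) for w, e in zip(words, exps)]

# Hypothetical article: each sentence is a list of (token, embedding) pairs.
sentences = [
    [("mayor", [1.0, 0.0]), ("spoke", [0.8, 0.2])],
    [("markets", [0.0, 1.0]), ("fell", [0.1, 0.9])],
    [("Springfield", [0.9, 0.1]), ("rally", [0.7, 0.3])],
]
image_vec = [1.0, 0.0]  # hypothetical image feature

kept = select_sentences(image_vec, sentences, k=2)
weights = word_attention(image_vec, sentences, kept)
for token, weight in weights:
    print(f"{token}: {weight:.3f}")
```

In this toy setup the second sentence is discarded at the sentence level, so its words never compete for attention at the word level. That is the sense in which the selection is "concentrated" before fine-grained choices such as named entities are made.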

Code & Models

Repositories

HAWLYQ/ICECAP (PyTorch, official)
