QACE: Asking Questions to Evaluate an Image Caption

Hwanhee Lee; Thomas Scialom; Seunghyun Yoon; Franck Dernoncourt; Kyomin Jung

arXiv:2108.12560 · cs.CL · August 31, 2021

TL;DR

QACE is a question-answering-based metric for image caption evaluation. It generates questions from the evaluated caption and answers them against either the reference caption or the image itself; the image-based variant relies on a new Visual-T5 model and is reference-less and explainable.

Contribution

The paper introduces QACE, a question-answering-based metric for caption evaluation, together with Visual-T5, an abstractive VQA model that lets the metric check a caption directly against the image without a reference.

Findings

01

QACE-Ref achieves results competitive with state-of-the-art metrics.

02

QACE-Img outperforms other reference-less metrics.

03

The Visual-T5 model enables effective multi-modal question answering.

Abstract

In this paper, we propose QACE, a new metric based on Question Answering for Caption Evaluation. QACE generates questions on the evaluated caption and checks its content by asking the questions on either the reference caption or the source image. We first develop QACE-Ref that compares the answers of the evaluated caption to its reference, and report competitive results with the state-of-the-art metrics. To go further, we propose QACE-Img, which asks the questions directly on the image, instead of reference. A Visual-QA system is necessary for QACE-Img. Unfortunately, the standard VQA models are framed as a classification among only a few thousand categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE-Img is multi-modal, reference-less, and explainable. Our experiments show that QACE-Img compares favorably w.r.t. other reference-less metrics. We…
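The question-then-answer loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `qace_like_score`, `token_f1`, and the toy stubs are hypothetical names, the question generator and answerer are stand-in callables (the paper uses learned models, including Visual-T5 for images), and token-level F1 is assumed here as the answer comparison purely for simplicity.

```python
from collections import Counter
from typing import Callable, List, Tuple


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between two answer strings (an assumed comparison)."""
    p, g = pred.lower().split(), gold.lower().split()
    if not p or not g:
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def qace_like_score(
    candidate: str,
    context: object,
    generate_questions: Callable[[str], List[Tuple[str, str]]],
    answer: Callable[[str, object], str],
    similarity: Callable[[str, str], float] = token_f1,
) -> float:
    """Average agreement between answers on the candidate and on the context.

    generate_questions(candidate) yields (question, answer-from-candidate) pairs.
    answer(question, context) answers the same question on the reference
    caption (QACE-Ref style) or on the image (QACE-Img style).
    """
    pairs = generate_questions(candidate)
    if not pairs:
        return 0.0
    scores = [similarity(answer(q, context), a) for q, a in pairs]
    return sum(scores) / len(scores)


# Toy stand-ins (hypothetical): fixed questions and a lookup-table "QA model".
def toy_questions(caption: str) -> List[Tuple[str, str]]:
    return [("What is on the grass?", "a dog"),
            ("What color is the dog?", "brown")]


def toy_answer(question: str, context: object) -> str:
    return context.get(question, "")


reference_answers = {"What is on the grass?": "a dog",
                     "What color is the dog?": "black"}
score = qace_like_score("a brown dog on the grass", reference_answers,
                        toy_questions, toy_answer)
```

In this toy run one answer agrees and one does not, so the score is 0.5. The per-question scores are also what make such a metric inspectable: each mismatch points at a specific question the caption got wrong.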

Code & Models

Repositories

hwanheelee1993/qace
PyTorch · Official
