REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning

Ming Jiang; Junjie Hu; Qiuyuan Huang; Lei Zhang; Jana Diesner; Jianfeng Gao

arXiv:1909.02217 · cs.CL · September 6, 2019

TL;DR

This paper introduces REO, a fine-grained evaluation method for image captioning that assesses relevance, extraness, and omission, providing more detailed insights than traditional single-score metrics.

Contribution

The study proposes a novel evaluation approach that offers detailed analysis of caption quality from multiple perspectives, improving correlation with human judgments.

Findings

01. REO correlates better with human judgments than traditional metrics.

02. REO provides more intuitive and detailed evaluation results.

03. Experiments on benchmark datasets validate the effectiveness of REO.

Abstract

Popular metrics used for evaluating image captioning systems, such as BLEU and CIDEr, provide a single score to gauge the system's overall effectiveness. This score is often not informative enough to indicate what specific errors are made by a given system. In this study, we present a fine-grained evaluation method REO for automatically measuring the performance of image captioning systems. REO assesses the quality of captions from three perspectives: 1) Relevance to the ground truth, 2) Extraness of the content that is irrelevant to the ground truth, and 3) Omission of the elements in the images and human references. Experiments on three benchmark datasets demonstrate that our method achieves a higher consistency with human judgments and provides more intuitive evaluation results than alternative metrics.
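To make the three perspectives concrete, here is a toy sketch of how a caption could be scored on each one separately. It is not the REO method: the abstract does not specify the underlying representation, so plain word-set overlap is assumed as a stand-in, and it compares against a single human reference only (no image features). The names `tokenize` and `toy_reo` are hypothetical.

```python
# Toy illustration only. The abstract does not define how REO computes its
# scores, so word-set overlap is an ASSUMED stand-in for the real method.

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if w}


def toy_reo(candidate, reference):
    """Return three separate scores instead of one aggregate number."""
    cand, ref = tokenize(candidate), tokenize(reference)
    shared = cand & ref
    return {
        # Share of candidate words that also appear in the reference.
        "relevance": len(shared) / len(cand) if cand else 0.0,
        # Share of candidate words that the reference does not support.
        "extraness": len(cand - ref) / len(cand) if cand else 0.0,
        # Share of reference words that the candidate fails to mention.
        "omission": len(ref - cand) / len(ref) if ref else 0.0,
    }


scores = toy_reo(
    "a dog runs on the beach",
    "a dog plays with a ball on the beach",
)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Even in this simplified form, the separate scores indicate which kind of error a caption makes (unsupported content versus missing content), which a single aggregate score cannot show.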

Code & Models

Repositories

SeleenaJM/CapEval (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition