Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

Kung-Hsiang Huang; Mingyang Zhou; Hou Pong Chan; Yi R. Fung; Zhenhailong Wang; Lingyu Zhang; Shih-Fu Chang; Heng Ji

arXiv:2312.10160 · cs.CL · May 31, 2024 · 1 citation

TL;DR

This paper investigates factual inaccuracies in chart captioning by analyzing error patterns, creating a dataset, and proposing models for factual error detection and correction to improve reliability in visual data descriptions.

Contribution

It introduces a comprehensive error typology, a new dataset CHOCOLATE, and novel models for factual error detection and correction in chart captioning.

Findings

01. State-of-the-art models often produce factual errors in captions.

02. The proposed C2TFEC framework effectively corrects factual inaccuracies.

03. CHARTVE outperforms existing models in factual evaluation.

Abstract

Recent advancements in large vision-language models (LVLMs) have led to significant progress in generating natural language descriptions for visual content and thus enhancing various applications. One issue with these powerful models is that they sometimes produce texts that are factually inconsistent with the visual input. While there has been some effort to mitigate such inconsistencies in natural image captioning, the factuality of generated captions for structured document images, such as charts, has not received as much scrutiny, posing a potential threat to information reliability in critical applications. This work delves into the factuality aspect by introducing a comprehensive typology of factual errors in generated chart captions. A large-scale human annotation effort provides insight into the error patterns and frequencies in captions crafted by various chart captioning…

Taxonomy

Topics: Multimodal Machine Learning Applications · Subtitles and Audiovisual Media · Natural Language Processing Techniques