Faithful Chart Summarization with ChaTS-Pi
Syrine Krichene, Francesco Piccinno, Fangyu Liu, Julian Martin Eisenschlos

TL;DR
This paper introduces CHATS-CRITIC, a new reference-free metric for assessing the faithfulness of chart summaries, and CHATS-PI, a pipeline that improves chart-to-summary generation by fixing and ranking candidates, achieving state-of-the-art results.
Contribution
The work presents a novel reference-free metric for chart summary faithfulness and a pipeline that leverages this metric to enhance chart-to-summary generation.
Findings
CHATS-CRITIC correlates better with human ratings than existing metrics.
CHATS-PI improves the quality of chart summaries by fixing unsupported sentences.
CHATS-PI achieves state-of-the-art performance on two chart-to-summary datasets.
Abstract
Chart-to-summary generation can help explore data, communicate insights, and assist visually impaired people. Multi-modal generative models have been used to produce fluent summaries, but they can suffer from factual and perceptual errors. In this work we present CHATS-CRITIC, a reference-free chart summarization metric for scoring faithfulness. CHATS-CRITIC is composed of an image-to-text model to recover the table from a chart, and a tabular entailment model applied to score the summary sentence by sentence. We find that CHATS-CRITIC evaluates the summary quality according to human ratings better than reference-based metrics, either learned or n-gram based, and can be further used to fix candidate summaries by removing unsupported sentences. We then introduce CHATS-PI, a chart-to-summary pipeline that leverages CHATS-CRITIC during inference to fix and rank sampled candidates from…
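The critic-then-fix-and-rank idea described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the chart is taken to be already de-rendered into a table, `toy_entails` is a hypothetical stand-in for a tabular entailment model (it only checks that every number in a sentence appears in the table), the score is assumed to be the fraction of supported sentences, and the 0.5 threshold is arbitrary.

```python
import re
from typing import Callable, List, Tuple

Table = List[List[str]]
# Hypothetical entailment interface: returns a support score in [0, 1].
EntailFn = Callable[[Table, str], float]


def split_sentences(summary: str) -> List[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p for p in parts if p]


def critic(table: Table, summary: str, entails: EntailFn,
           threshold: float = 0.5) -> Tuple[float, str]:
    """Score a summary sentence by sentence and drop unsupported sentences.

    Returns (score, fixed_summary). The score is assumed here to be the
    fraction of sentences whose entailment score reaches the threshold.
    """
    sentences = split_sentences(summary)
    kept = [s for s in sentences if entails(table, s) >= threshold]
    score = len(kept) / len(sentences) if sentences else 0.0
    return score, " ".join(kept)


def fix_and_rank(table: Table, candidates: List[str], entails: EntailFn,
                 threshold: float = 0.5) -> List[Tuple[float, str]]:
    """Fix every sampled candidate, then sort by critic score (best first)."""
    scored = [critic(table, c, entails, threshold) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_entails(table: Table, sentence: str) -> float:
    """Toy stand-in for an entailment model: 1.0 if every number in the
    sentence appears as a table cell, else 0.0."""
    cells = {cell for row in table for cell in row}
    numbers = re.findall(r"\d+(?:\.\d+)?", sentence)
    return 1.0 if all(n in cells for n in numbers) else 0.0


table = [["year", "sales"], ["2020", "10"], ["2021", "15"]]
candidates = [
    "Sales were 10 in 2020. Sales were 99 in 2021.",
    "Sales were 10 in 2020. Sales were 15 in 2021.",
]
ranked = fix_and_rank(table, candidates, toy_entails)
best_score, best_summary = ranked[0]
```

In this toy run the second candidate ranks first because both of its sentences are supported, while the first candidate loses its unsupported sentence during fixing. A real system would replace `toy_entails` with a trained tabular entailment model and obtain `table` from an image-to-text de-rendering model.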
Taxonomy
Topics: Natural Language Processing Techniques · Data Quality and Management · Advanced Text Analysis Techniques
