AltChart: Enhancing VLM-based Chart Summarization Through Multi-Pretext Tasks

Omar Moured; Jiaming Zhang; M. Saquib Sarfraz; Rainer Stiefelhagen

arXiv:2405.13580·cs.CV·March 31, 2026

TL;DR

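This paper introduces AltChart, a dataset of 10,000 real chart images with detailed summaries, together with a multi-pretext-task pretraining method for vision-language models. The goal is more reliable chart summarization for blind and visually impaired users, and the pretraining method improves performance by approximately 2.5%.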

Contribution

The paper presents a new dataset of 10,000 real chart images, each paired with a detailed summary, and a novel pretraining approach that trains vision-language models (VLMs) on multiple pretext tasks to improve chart understanding.

Findings

01 The new pretraining method improves performance by approximately 2.5%.

02 The paper provides an extensive evaluation of four chart summarization models.

03 The dataset and code are publicly available for research and development.

Abstract

Chart summarization is a crucial task for blind and visually impaired individuals, as it is their primary means of accessing and interpreting graphical data. Crafting high-quality descriptions is challenging because the essential details of a chart must be communicated precisely to readers who cannot rely on visual perception. Many chart analysis methods, however, produce brief, unstructured responses that may contain significant hallucinations, which undermines their reliability for blind people. To address these challenges, this work presents three key contributions: (1) We introduce the AltChart dataset, comprising 10,000 real chart images, each paired with a comprehensive summary that features long-context and semantically rich annotations. (2) We propose a new method for pretraining Vision-Language Models (VLMs) to learn fine-grained chart representations through training with multiple pretext tasks,…

Code & Models

Repositories

moured/AltChart (GitHub)
