Enhancing Question Answering on Charts Through Effective Pre-training Tasks
Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah

TL;DR
This paper improves chart question answering by analyzing current models' limitations and introducing three pre-training tasks that enhance understanding of visual, structural, and numerical aspects, leading to better performance.
Contribution
It proposes three novel pre-training tasks to enhance visual and structural understanding in chart question answering models, addressing key limitations of existing approaches.
Findings
Achieved 1.7% average improvement over baseline models
Identified key weaknesses in current models' understanding of chart structure and numerical data
Validated effectiveness across multiple chart datasets
Abstract
To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. Therefore, in this work, we address the limitations of current VisualQA models when applied to charts and plots. To investigate shortcomings of the state-of-the-art models, we conduct a comprehensive behavioral analysis, using ChartQA as a case study. Our findings indicate that existing models particularly underperform in answering questions related to the chart's structural and visual context, as well as numerical information. To address these issues, we propose three simple pre-training tasks that enforce the existing model in terms…
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Advanced Text Analysis Techniques
