Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents
Akriti Jain, Pritika Ramu, Aparna Garimella, Apoorv Saxena

TL;DR
This paper introduces Doc2Chart, a zero-shot framework that generates data visualizations from long documents based on user intents. It uses an unsupervised, two-stage process and is evaluated with an attribution-based accuracy metric.
Contribution
It presents a novel intent-driven, zero-shot method for generating charts from documents, along with a new dataset and an attribution-based evaluation metric.
Findings
Doc2Chart outperforms baselines in chart data accuracy by up to 9 points.
It improves chart type accuracy over baselines by up to 17 points.
Its effectiveness is validated across the finance and scientific domains.
Abstract
Large Language Models (LLMs) have demonstrated strong capabilities in transforming text descriptions or tables to data visualizations via instruction-tuning methods. However, it is not straightforward to apply these methods directly for a more real-world use case of visualizing data from long documents based on user-given intents, as opposed to the user pre-selecting the relevant content manually. We introduce the task of intent-based chart generation from documents: given a user-specified intent and document(s), the goal is to generate a chart adhering to the intent and grounded on the document(s) in a zero-shot setting. We propose an unsupervised, two-staged framework in which an LLM first extracts relevant information from the document(s) by decomposing the intent and iteratively validates and refines this data. Next, a heuristic-guided module selects an appropriate chart type before…
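As a rough illustration of the second stage described above, a heuristic chart-type selector might look like the sketch below. This is not the authors' module: the rules, the keyword list, and the names `Series`, `is_time_label`, and `select_chart_type` are all assumptions made for illustration, and the actual heuristics used by Doc2Chart are not given in this summary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Series:
    """Hypothetical container for data already extracted from a document."""
    labels: List[str]    # x-axis categories or time points
    values: List[float]  # one numeric value per label


def is_time_label(label: str) -> bool:
    """Assumed rule: a label is temporal if it is a 4-digit year."""
    text = label.strip()
    return len(text) == 4 and text.isdigit()


def select_chart_type(intent: str, series: Series) -> str:
    """Pick a chart type from simple, illustrative heuristics.

    These rules are placeholders, not the heuristics from the paper.
    """
    intent_lower = intent.lower()

    # Three or more time-ordered points read best as a line chart.
    temporal = bool(series.labels) and all(is_time_label(l) for l in series.labels)
    if temporal and len(series.labels) >= 3:
        return "line"

    # Part-of-whole intents with non-negative values suggest a pie chart.
    part_of_whole = ("share", "proportion", "breakdown", "composition")
    if any(word in intent_lower for word in part_of_whole) and all(
        v >= 0 for v in series.values
    ):
        return "pie"

    # Default: categorical comparison as a bar chart.
    return "bar"
```

For example, under these assumed rules an intent such as "Show revenue trend" over the labels 2019, 2020, and 2021 yields a line chart, while a part-of-whole intent over named segments yields a pie chart.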
Taxonomy
Topics: Data Visualization and Analytics · Handwritten Text Recognition Techniques · Computational and Text Analysis Methods
