CHARTER: heatmap-based multi-type chart data extraction
Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf Arbelle, Leonid Karlinsky

TL;DR
This paper introduces CHARTER, a neural network-based system that accurately extracts structured data from various types of charts in documents, using heatmap predictions trained on synthetic data.
Contribution
The method employs heatmap-based detection for precise chart element identification, overcoming bounding-box limitations and enabling end-to-end chart data extraction.
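To illustrate the general idea of heatmap-based detection, the sketch below turns a 2D heatmap into point detections by picking local maxima above a threshold. It is a minimal, hypothetical example: the function names (`gaussian_heatmap`, `heatmap_peaks`), the threshold, and the 8-neighbour peak rule are assumptions made here for illustration, not the post-processing used by CHARTER, which this summary does not describe.

```python
import math


def gaussian_heatmap(height, width, centers, sigma=1.5):
    """Build a synthetic heatmap as a sum of Gaussian blobs (illustrative only)."""
    return [
        [
            sum(
                math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
                for cr, cc in centers
            )
            for c in range(width)
        ]
        for r in range(height)
    ]


def heatmap_peaks(heatmap, threshold=0.5):
    """Return (row, col) cells that are strict local maxima at or above threshold.

    Assumption: a detection is a cell strictly greater than all of its
    8-connected neighbours, so flat plateaus yield no detection.
    """
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the in-bounds neighbours of (r, c), excluding the cell itself.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Two synthetic "chart elements" placed at known positions.
heatmap = gaussian_heatmap(32, 32, [(5, 7), (20, 12)])
detections = heatmap_peaks(heatmap)
print(detections)
```

Unlike a bounding box, each detection here is a single point, which suits elements such as scatter-plot markers or line vertices that have no meaningful rectangular extent.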
Findings
High robustness and precision demonstrated on benchmarks
Effective detection of pie, line, and scatter plots
Trains solely on synthetic data, eliminating the need to collect real training data
Abstract
The digital conversion of information stored in documents is a great source of knowledge. In contrast to document text, the conversion of graphics embedded in documents, such as charts and plots, has been much less explored. We present a method and a system for end-to-end conversion of document charts into a machine-readable tabular data format, which can be easily stored and analyzed in the digital domain. Our approach extracts and analyzes charts along with their graphical elements and supporting structures such as legends, axes, titles, and captions. Our detection system is based on neural networks, trained solely on synthetic data, eliminating the limiting factor of data collection. As opposed to previous methods, which detect graphical elements using bounding boxes, our networks feature auxiliary domain-specific heatmap prediction, enabling the precise detection of pie charts,…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Digital Media Forensic Detection
