Flexible Table Recognition and Semantic Interpretation System
Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim Köhler

TL;DR
This paper presents a flexible, modular system for table recognition and semantic interpretation that combines rule-based algorithms with graph-based interpretation, achieving competitive results on challenging benchmarks.
Contribution
It introduces a novel combination of rule-based recognition and graph-based semantic interpretation for flexible table extraction.
Findings
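The complete information extraction system achieved an F1 score of 0.7380 in the table interpretation experiment.
Two rule-based algorithms each perform the complete table recognition process (table detection and segmentation), with results on ICDAR 2013 and ICDAR 2019 competitive with state-of-the-art approaches.
Ground-truth annotations, evaluation scripts, and algorithm parameters from the table interpretation experiment are made available to support future research.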
Abstract
Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including table detection and segmentation, and support the most frequent table formats. Moreover, to incorporate the extraction of semantic information, we develop a graph-based table interpretation method. We conduct extensive experiments on the challenging table recognition benchmarks ICDAR 2013 and ICDAR 2019, achieving results competitive with state-of-the-art approaches. Our complete information extraction system exhibited a high F1 score of 0.7380. To support future research on information extraction from documents, we make the resources (ground-truth annotations, evaluation scripts, algorithm parameters) from our table interpretation experiment publicly…
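For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that standard formula only; the function name and the counts in the usage lines are hypothetical and are not taken from the paper or its evaluation scripts.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    The counts are illustrative inputs, e.g. correctly extracted items (TP),
    spurious extractions (FP), and missed ground-truth items (FN).
    """
    predicted = true_positives + false_positives  # everything the system output
    actual = true_positives + false_negatives     # everything in the ground truth
    if predicted == 0 or actual == 0:
        return 0.0  # precision or recall is undefined; report 0 by convention
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the paper.
example = f1_score(true_positives=3, false_positives=1, false_negatives=3)
print(f"F1 = {example:.4f}")
```

With these hypothetical counts, precision is 0.75 and recall is 0.5, so a system that misses many items is penalized even when most of what it outputs is correct.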
Taxonomy
Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Data Quality and Management
