Tablext: A Combined Neural Network And Heuristic Based Table Extractor
Zach Colter, Morteza Fayazi, Zineb Benameur-El, Serafina Kamp, Shuyan, Yu, Ronald Dreslinski

TL;DR
Tablext is a versatile table extraction tool combining neural networks and computer vision to accurately extract data from various table formats, including images, outperforming existing methods.
Contribution
The paper introduces a novel multi-stage table extractor that integrates CNNs, YOLO, and computer vision techniques to handle diverse table types without relying on machine-readable data.
Findings
Outperforms state-of-the-art on ICDAR 2013 dataset
Handles tables with non-machine-readable formats effectively
Achieves high accuracy across complex table layouts
Abstract
A significant portion of the data available today is found within tables. Therefore, it is necessary to use automated table extraction to obtain thorough results when data-mining. Today's popular state-of-the-art methods for table extraction struggle to adequately extract tables with machine-readable text and structural data. To make matters worse, many tables do not have machine-readable data, such as tables saved as images, making most extraction methods completely ineffective. In order to address these issues, a novel, general format table extractor tool, Tablext, is proposed. This tool uses a combination of computer vision techniques and machine learning methods to efficiently and effectively identify and extract data from tables. Tablext begins by using a custom Convolutional Neural Network (CNN) to identify and separate all potential tables. The identification process is optimized…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHandwritten Text Recognition Techniques · Vehicle License Plate Recognition · Currency Recognition and Detection
MethodsYou Only Look Once
