TableZa -- A classical Computer Vision approach to Tabular Extraction
Saumya Banthia, Anantha Sharma, Ravi Mangipudi

TL;DR
This paper presents TableZa, a classical computer vision method for extracting tabular data from images or from vector PDFs converted to images, a task that demands both spectral and spatial sanity of the extracted data.
Contribution
The paper introduces a computer vision-based approach to tabular data extraction, intended to handle the varied table formats found across documents.
Findings
Effective extraction of tabular data from images and PDFs.
Addresses spectral and spatial sanity in data extraction.
Applicable to various tabular formats.
Abstract
Computer-aided tabular data extraction has always been a challenging and error-prone task because it demands both spectral and spatial sanity of the data. In this paper we discuss an approach to tabular data extraction in the realm of document comprehension. Given the variety of tabular formats found across documents, we discuss a novel computer vision approach for extracting tabular data from images or from vector PDFs converted to images.
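The abstract does not spell out the extraction pipeline, so the following is only a minimal sketch of one common classical technique for ruled tables, not the TableZa method itself. It assumes the page has already been rasterized and binarized into a grid of 0/1 pixels, finds ruling lines as rows or columns containing a long unbroken run of dark pixels, and returns the cell interiors between consecutive rulings. All names (`longest_run`, `ruling_positions`, `find_cells`, `make_demo_grid`) and the `min_fraction` threshold are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumption: 1 = dark (ink) pixel, 0 = background


def longest_run(values: List[int]) -> int:
    """Length of the longest unbroken run of dark pixels in one line."""
    best = run = 0
    for v in values:
        run = run + 1 if v else 0
        best = max(best, run)
    return best


def ruling_positions(lines: Image, min_fraction: float = 0.8) -> List[Tuple[int, int]]:
    """Find ruling lines: lines whose longest dark run spans at least
    min_fraction of the line. Adjacent hits are merged into one
    (start, end) band so that thick rulings count once."""
    hits = [
        i for i, line in enumerate(lines)
        if line and longest_run(line) >= min_fraction * len(line)
    ]
    merged: List[List[int]] = []
    for i in hits:
        if merged and i - merged[-1][1] == 1:
            merged[-1][1] = i
        else:
            merged.append([i, i])
    return [(start, end) for start, end in merged]


def find_cells(image: Image) -> List[Tuple[int, int, int, int]]:
    """Return cell interiors as inclusive (top, left, bottom, right) boxes,
    listed row by row, between consecutive horizontal and vertical rulings."""
    rows = ruling_positions(image)
    cols = ruling_positions([list(col) for col in zip(*image)])  # transpose
    cells = []
    for (_, top_end), (bottom_start, _) in zip(rows, rows[1:]):
        for (_, left_end), (right_start, _) in zip(cols, cols[1:]):
            cells.append((top_end + 1, left_end + 1, bottom_start - 1, right_start - 1))
    return cells


def make_demo_grid() -> Image:
    """Hypothetical 7x9 binary image of a 2x2 table with 1-pixel rulings."""
    ruled_rows, ruled_cols = {0, 3, 6}, {0, 4, 8}
    return [
        [1 if (r in ruled_rows or c in ruled_cols) else 0 for c in range(9)]
        for r in range(7)
    ]


if __name__ == "__main__":
    for box in find_cells(make_demo_grid()):
        print(box)
```

A run-length test is used here instead of a plain per-row pixel count so that a dense line of text is less likely to be mistaken for a ruling. Tables without drawn rulings would need a different cue, such as whitespace gaps between columns.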
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Currency Recognition and Detection
