TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition

Umar Khan; Sohaib Zahid; Muhammad Asad Ali; Adnan ul Hassan; Faisal Shafait

arXiv:2104.14237·cs.CV·May 18, 2021

TL;DR

TabAug introduces a novel data augmentation method that creates structural variations in table images, significantly improving table structure recognition accuracy in scenarios with limited labeled data.

Contribution

The paper presents TabAug, a data-driven augmentation technique that manipulates table structures through row and column replication and deletion, enhancing deep learning model performance.
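The replication and deletion operations described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `replicate_band` and `delete_band` are hypothetical, the table is assumed to be an image array with known row or column boundary coordinates, and merged (spanning) cells are ignored.

```python
import numpy as np

def replicate_band(image, bounds, i, axis=0):
    """Duplicate the i-th row (axis=0) or column (axis=1) of a table image.

    bounds is a sorted list of pixel coordinates [0, b1, ..., size] along
    `axis`; band i spans bounds[i]:bounds[i + 1]. Hypothetical helper.
    """
    img = np.swapaxes(image, 0, axis)           # work along the first axis
    start, stop = bounds[i], bounds[i + 1]
    size = stop - start
    # Insert a copy of the band directly after the original band.
    out = np.concatenate([img[:stop], img[start:stop], img[stop:]], axis=0)
    # Keep boundaries up to the band's end, then shift the rest by its size.
    new_bounds = bounds[:i + 2] + [b + size for b in bounds[i + 1:]]
    return np.swapaxes(out, 0, axis), new_bounds

def delete_band(image, bounds, i, axis=0):
    """Remove the i-th row (axis=0) or column (axis=1) of a table image."""
    img = np.swapaxes(image, 0, axis)
    start, stop = bounds[i], bounds[i + 1]
    size = stop - start
    out = np.concatenate([img[:start], img[stop:]], axis=0)
    # Drop the band's end boundary and shift everything after it back.
    new_bounds = bounds[:i + 1] + [b - size for b in bounds[i + 2:]]
    return np.swapaxes(out, 0, axis), new_bounds

# Toy 6x4 "image" with three rows at y = 0-2, 2-5, 5-6.
table = np.arange(24).reshape(6, 4)
row_bounds = [0, 2, 5, 6]
longer, longer_bounds = replicate_band(table, row_bounds, 1, axis=0)
shorter, shorter_bounds = delete_band(table, row_bounds, 1, axis=0)
```

Because the boundary list is updated together with the pixels, each augmented image keeps a consistent structure label, which is what distinguishes this kind of augmentation from purely photometric changes.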

Findings

01. Cell-level detection accuracy improved from 92.16% to 96.11%.

02. Consistent improvements across all evaluation metrics on ICDAR 2013.

03. Structural augmentation outperforms traditional image-based augmentation techniques.

Abstract

Table structure recognition is an essential part of end-to-end tabular data extraction from document images. The recent success of deep learning architectures in computer vision has yet to be reflected in table structure recognition, largely because extensive datasets for this domain are still unavailable and labeling new data is expensive and time-consuming. Traditionally, in computer vision, these challenges are addressed by standard augmentation techniques based on image transformations such as color jittering and random cropping. As our experiments demonstrate, these techniques are not effective for table structure recognition. In this paper, we propose TabAug, a re-imagined data augmentation technique that produces structural changes in table images through replication and deletion of rows and columns. It also consists of a data-driven probabilistic…
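For contrast, the standard image-level transformations the abstract refers to can be sketched as follows. This is a simplified illustration, assuming a float image scaled to [0, 1]; `brightness_jitter` stands in for full color jittering, and both function names are hypothetical. Neither transformation adds or duplicates rows or columns in the table.

```python
import numpy as np

def brightness_jitter(image, rng, max_delta=0.2):
    """Shift all pixel intensities by one random offset (a simple
    stand-in for color jittering), clipping back to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)

def random_crop(image, rng, crop_h, crop_w):
    """Cut a random crop_h x crop_w window out of the image."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_h + 1)    # upper bound is exclusive
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
page = rng.random((64, 48))                  # toy grayscale "table image"
jittered = brightness_jitter(page, rng)
cropped = random_crop(page, rng, 32, 32)
```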

Code & Models

Repositories

sohaib023/splerge-tab-aug (PyTorch, official)