TNCR: Table Net Detection and Classification Dataset

Abdelrahman Abdallah; Alexander Berendeyev; Islam Nuradin; Daniyar Nurseitov

arXiv:2106.15322 · cs.CV · December 30, 2021

TL;DR

This paper introduces TNCR, a dataset of 9428 labeled images for detecting tables in scanned document images and classifying them into 5 classes, along with baseline results from state-of-the-art deep learning detectors.

Contribution

The paper provides a new, publicly available dataset with diverse images and benchmarks state-of-the-art deep learning models for table detection and classification.

Findings

01

Cascade Mask R-CNN achieved 79.7% precision

02

The same model reached a recall of 89.8% and an F1 score of 84.4%

03

TNCR dataset is open source and ready for research use

Abstract

We present TNCR, a new table dataset with varying image quality collected from free websites. The TNCR dataset can be used for table detection in scanned document images and their classification into 5 different classes. TNCR contains 9428 high-quality labeled images. In this paper, we have implemented state-of-the-art deep learning-based methods for table detection to create several strong baselines. Cascade Mask R-CNN with a ResNeXt-101-64x4d backbone network achieves the highest performance among the methods compared, with a precision of 79.7%, a recall of 89.8%, and an F1 score of 84.4% on the TNCR dataset. We have made TNCR open source in the hope of encouraging more deep learning approaches to table detection, classification, and structure recognition. The dataset and trained model checkpoints are available at https://github.com/abdoelsayed2016/TNCR_Dataset.
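
The three headline metrics in the abstract are linked: F1 is conventionally the harmonic mean of precision and recall. The short sketch below checks that the reported figures are mutually consistent. It assumes the standard F1 definition; the function name f1_score is illustrative, and the matching criteria the paper uses to count a detection as correct (for example, an IoU threshold) are not stated on this page and are not modeled here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when there are no correct detections at all.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for Cascade Mask R-CNN (ResNeXt-101-64x4d).
precision = 0.797
recall = 0.898

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.844"
```

The result rounds to 84.4%, matching the F1 score given in the abstract.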

Code & Models

Repositories

abdoelsayed2016/TNCR_Dataset
PyTorch · Official

Taxonomy

Methods: Region Proposal Network · Convolution · Softmax · Cascade Mask R-CNN · RoIAlign · Mask R-CNN