The Benefits of Close-Domain Fine-Tuning for Table Detection in Document Images

Ángela Casado-García, César Domínguez, Jónathan Heras, Eloy Mata, and Vico Pascual

arXiv:1912.05846 · cs.CV · December 13, 2019

TL;DR

This paper shows that table detection models fine-tuned from weights trained on document images are more accurate than the same models fine-tuned from natural-image weights, which underlines the importance of a close source domain.

Contribution

The study shows that close-domain fine-tuning, starting from models trained on a document image dataset, yields more accurate table detection than the usual fine-tuning from natural-image models.

Findings

1. Fine-tuning from document image datasets improves accuracy by up to 60% over fine-tuning from natural images.

2. Models fine-tuned from TableBank-trained weights outperform those fine-tuned from natural images.

3. The closer the source domain is to document images, the more effective fine-tuning is for table detection.

Abstract

A correct localisation of tables in a document is instrumental for determining their structure and extracting their contents; therefore, table detection is a key step in table understanding. Nowadays, the most successful methods for table detection in document images employ deep learning algorithms; and, particularly, a technique known as fine-tuning. In this context, such a technique exports the knowledge acquired to detect objects in natural images to detect tables in document images. However, there is only a vague relation between natural and document images, and fine-tuning works better when there is a close relation between the source and target task. In this paper, we show that it is more beneficial to employ fine-tuning from a closer domain. To this aim, we train different object detection algorithms (namely, Mask R-CNN, RetinaNet, SSD and YOLO) using the TableBank dataset (a…
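The claim that fine-tuning works better from a close source domain can be illustrated with a toy analogy. The sketch below is not the paper's method and does not use Mask R-CNN, RetinaNet, SSD, YOLO, or TableBank. It fits a small linear model with a fixed budget of gradient-descent steps, once from a starting point near the target weights and once from a distant one. All names and numbers (`TARGET_W`, `CLOSE_INIT`, `FAR_INIT`, `fine_tune`, the step count, and the learning rate) are assumptions chosen for illustration.

```python
import random


def make_target_data(true_w, n=20, seed=0):
    """Build a small synthetic 'target task': y = true_w . x, without noise."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1.0, 1.0) for _ in true_w] for _ in range(n)]
    ys = [sum(w * x for w, x in zip(true_w, row)) for row in xs]
    return xs, ys


def mse(w, xs, ys):
    """Mean squared error of the linear model w on (xs, ys)."""
    total = 0.0
    for row, y in zip(xs, ys):
        pred = sum(wi * xi for wi, xi in zip(w, row))
        total += (pred - y) ** 2
    return total / len(xs)


def fine_tune(init_w, xs, ys, steps=10, lr=0.1):
    """Run a fixed budget of gradient-descent steps starting from init_w."""
    w = list(init_w)
    n = len(xs)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, row)) - y
            for j, xj in enumerate(row):
                grad[j] += 2.0 * err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


# Hypothetical weights, chosen only for illustration.
TARGET_W = [2.0, -1.0, 0.5]    # stands in for the target task
CLOSE_INIT = [1.8, -0.9, 0.6]  # stands in for a close-domain starting point
FAR_INIT = [-2.0, 1.0, 2.5]    # stands in for a distant-domain starting point

if __name__ == "__main__":
    xs, ys = make_target_data(TARGET_W)
    for name, init in [("close", CLOSE_INIT), ("far", FAR_INIT)]:
        tuned = fine_tune(init, xs, ys)
        print(name, "init -> target-task MSE after fine-tuning:", mse(tuned, xs, ys))
```

With the same training budget, the run that starts near the target ends with the lower error. This mirrors the intuition behind close-domain fine-tuning in a much simpler setting, and it says nothing about the size of the effect for real detectors.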

Code & Models

Repositories

holms-ur/fine-tuning (MXNet, official)

Taxonomy

Methods: Region Proposal Network · Focal Loss · Softmax · RoIAlign · Feature Pyramid Network · Convolution · RetinaNet · Non Maximum Suppression · Mask R-CNN · 1x1 Convolution