Improving accuracy and speeding up Document Image Classification through parallel systems

Javier Ferrando; Juan Luis Dominguez; Jordi Torres; Raul Garcia; David Garcia; Daniel Garrido; Jordi Cortada; Mateo Valero

arXiv:2006.09141·cs.CV·June 17, 2020

TL;DR

This study demonstrates that using EfficientNet models and ensemble techniques can improve document image classification accuracy while reducing computational costs, and highlights the benefits of parallel training across multiple GPUs.

Contribution

The paper introduces a lightweight EfficientNet-based model, an ensemble pipeline combining image and text models, and analyzes parallel training efficiency across frameworks.

Findings

01

EfficientNet outperforms heavier CNNs in document classification.

02

Ensemble of image and text models boosts accuracy.

03

Parallel training with larger batch sizes reduces training time.

Abstract

This paper presents a study showing the benefits of the EfficientNet models compared with heavier Convolutional Neural Networks (CNNs) in the Document Classification task, an essential problem in the digitalization process of institutions. We show on the RVL-CDIP dataset that we can improve previous results with a much lighter model, and we present its transfer learning capabilities on a smaller in-domain dataset, Tobacco3482. Moreover, we present an ensemble pipeline that improves on image-only input by combining the image model's predictions with those generated by a BERT model on text extracted by OCR. We also show that the batch size can be effectively increased without hindering accuracy, so that the training process can be sped up by parallelizing across multiple GPUs, decreasing the computational time needed. Lastly, we expose the training performance differences…
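
The ensemble idea in the abstract (combining image model predictions with text model predictions) can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not say how the two sets of predictions are combined, so the weighted average of class probabilities below is an assumption, and the names `softmax`, `ensemble_probs`, `predict`, and `image_weight` are hypothetical. Each model is assumed to emit one logit per document class.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(image_probs, text_probs, image_weight=0.5):
    """Weighted average of two probability vectors.

    image_weight is a hypothetical mixing parameter: 1.0 uses only the
    image model, 0.0 uses only the text model.
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    text_weight = 1.0 - image_weight
    return [
        image_weight * p + text_weight * q
        for p, q in zip(image_probs, text_probs)
    ]


def predict(image_logits, text_logits, image_weight=0.5):
    """Return the index of the class with the highest combined probability."""
    combined = ensemble_probs(
        softmax(image_logits), softmax(text_logits), image_weight
    )
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    # Illustrative logits only: the image model leans toward class 0,
    # while the text model is confident in class 2.
    image_logits = [2.0, 1.0, 0.0]
    text_logits = [0.0, 0.0, 5.0]
    print(predict(image_logits, text_logits, image_weight=0.5))
    print(predict(image_logits, text_logits, image_weight=1.0))
```

In a setup like this, the mixing weight would normally be chosen on a validation set, and averaging probabilities rather than raw logits keeps the two models on a comparable scale.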

Code & Models

Repositories

javiferran/document-classification
pytorch

Taxonomy

Methods: Linear Layer · RMSProp · Pointwise Convolution · Depthwise Convolution · Weight Decay · Softmax · Batch Normalization · Adam · Multi-Head Attention