Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

Arindam Das; Saikat Roy; Ujjwal Bhattacharya; Swapan Kumar Parui

arXiv:1801.09321·cs.CV·September 3, 2018

TL;DR

This paper introduces a region-based deep learning framework for document image classification that leverages intra-domain transfer learning and stacked generalization, achieving state-of-the-art accuracy on the RVL-CDIP dataset.

Contribution

It presents a novel combination of intra-domain transfer learning and stacked ensemble methods for improved document image classification.

Findings

1. Achieved 92.2% accuracy on the RVL-CDIP dataset.

2. Outperformed existing benchmark algorithms.

3. Demonstrated effective transfer learning within document domains.

Abstract

In this work, a region-based Deep Convolutional Neural Network framework is proposed for document structure learning. The contribution of this work involves efficient training of region-based classifiers and effective ensembling for document image classification. A primary level of 'inter-domain' transfer learning is used by exporting weights from a VGG16 architecture pre-trained on the ImageNet dataset to train a document classifier on whole document images. Exploiting the nature of region-based influence modelling, a secondary level of 'intra-domain' transfer learning is used for rapid training of deep learning models for image segments. Finally, stacked generalization based ensembling is utilized for combining the predictions of the base deep neural network models. The proposed method achieves state-of-the-art accuracy of 92.2% on the popular RVL-CDIP document image dataset,…
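
The stacked generalization step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The base models here are hand-written probability tables standing in for the whole-image and region CNNs, and the meta-learner is assumed to be a plain softmax regression, which the abstract does not specify. All function names (`stack_features`, `train_meta`, `predict_meta`) and the toy numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(base_probs):
    """Concatenate the class-probability vectors of all base models."""
    return [p for probs in base_probs for p in probs]


def class_scores(weights, bias, x):
    """Linear score per class for one stacked feature vector."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, bias)]


def train_meta(features, labels, n_classes, epochs=200, lr=0.5):
    """Fit a softmax-regression meta-learner with per-sample gradient descent."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, bias, x))
            for c in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the class-c score.
                grad = probs[c] - (1.0 if c == y else 0.0)
                bias[c] -= lr * grad
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
    return weights, bias


def predict_meta(weights, bias, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, bias, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Assumed toy outputs of two base models on three samples (classes 0, 1, 2).
# The "holistic" model is reliable; the "region" model swaps classes 1 and 2,
# so averaging the two would tie on those classes.
holistic = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
region = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]
labels = [0, 1, 2]

features = [stack_features([h, r]) for h, r in zip(holistic, region)]
weights, bias = train_meta(features, labels, n_classes=3)
predictions = [predict_meta(weights, bias, x) for x in features]
print(predictions)
```

In a real pipeline the meta-learner would normally be trained on base-model predictions for held-out data rather than on the samples the base models were fitted to; the toy example skips that split for brevity.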
