Exploring Multi-Tasking Learning in Document Attribute Classification
Tanmoy Mondal, Abhijit Das, Zuheng Ming

TL;DR
This paper introduces a multi-task learning approach with a hybrid CNN architecture for classifying various document attributes from images, utilizing word-level and patch-level data, and employs an intelligent voting system for overall document classification.
Contribution
It presents a novel multi-task learning network, a combined MTL+MI CNN architecture, and an intelligent voting system for document attribute classification.
Findings
Effective classification of document attributes using word and patch data.
Improved accuracy through joint learning and voting mechanisms.
Demonstrated robustness across different document types.
Abstract
In this work, we explore a Multi-Tasking learning (MTL) based network to perform document attribute classification, such as classification of the font type, font size, font emphasis and scanning resolution of a document image. To accomplish these tasks, we operate either on segmented words or on uniformly sized patches randomly cropped out of the document. Furthermore, a hybrid convolutional neural network (CNN) architecture, "MTL+MI", which combines MTL with Multi-Instance (MI) of patches and words, is used to accomplish joint learning for the classification of the same document attributes. The contributions of this paper are threefold: firstly, based on segmented word images and patches, we present an MTL based network for the classification of a full document image. Secondly, we propose an MTL and MI (using segmented words and patches) based combined CNN…
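The step from word- or patch-level predictions to a single document-level label per attribute can be illustrated with a minimal sketch. The abstract excerpt does not specify how the paper's voting system works, so the code below swaps in plain majority voting as a stand-in; the function name `vote_document_attributes`, the attribute keys, and the label values are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical attribute names mirroring the four tasks listed in the abstract.
ATTRIBUTES = ("font_type", "font_size", "font_emphasis", "scan_resolution")


def vote_document_attributes(instance_predictions):
    """Aggregate per-word or per-patch predictions into one label per attribute.

    Assumption: simple majority voting. The paper's own voting scheme is not
    described in the excerpt and may differ.
    """
    document_labels = {}
    for attribute in ATTRIBUTES:
        # Count how often each label was predicted for this attribute.
        votes = Counter(
            prediction[attribute]
            for prediction in instance_predictions
            if attribute in prediction
        )
        if votes:
            # Keep the most frequently predicted label.
            document_labels[attribute] = votes.most_common(1)[0][0]
    return document_labels


# Made-up predictions for three word images from one document.
predictions = [
    {"font_type": "Arial", "font_size": "12pt", "font_emphasis": "bold", "scan_resolution": "300dpi"},
    {"font_type": "Arial", "font_size": "12pt", "font_emphasis": "none", "scan_resolution": "300dpi"},
    {"font_type": "Times", "font_size": "10pt", "font_emphasis": "none", "scan_resolution": "300dpi"},
]

document_labels = vote_document_attributes(predictions)
```

Each attribute is voted on independently, matching the multi-task setup in which one network emits a separate prediction per attribute for every word or patch.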
Taxonomy
Topics: Handwritten Text Recognition Techniques · Digital Media Forensic Detection · Image Retrieval and Classification Techniques
Methods: Convolution
