Efficient Document Image Classification Using Region-Based Graph Neural Network
Jaya Krishna Mandivarapu, Eric Bunch, Qian You, Glenn Fung

TL;DR
This paper introduces an efficient document image classification framework using graph neural networks that leverages textual, visual, and layout information, achieving near state-of-the-art performance with significantly reduced computational resources.
Contribution
The paper presents a novel, resource-efficient GNN-based framework for document image classification that outperforms existing models in cost and speed while maintaining high accuracy.
Findings
Achieves near SOTA classification performance
Requires significantly less training and inference time
Offers better cost efficiency for scalable deployment
Abstract
Document image classification remains a popular research area because it can be commercialized in many enterprise applications across different industries. Recent advancements in large pre-trained computer vision and language models and in graph neural networks have given document image classification many new tools. However, using large pre-trained models usually requires substantial computing resources, which could defeat the cost-saving advantages of automatic document image classification. In this paper we propose an efficient document image classification framework that uses graph convolutional neural networks and incorporates the textual, visual, and layout information of the document. We have rigorously benchmarked our proposed algorithm against several state-of-the-art vision and language models on both a publicly available dataset and a real-life insurance document classification dataset. Empirical…
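To make the idea in the abstract concrete, here is a minimal sketch of how a region-based graph classifier could be wired together. It is not the authors' implementation: the feature dimensions, the chain-shaped adjacency matrix, the two-layer depth, the mean pooling, and every function name (`normalize_adjacency`, `gcn_layer`, `classify_document`) are assumptions chosen for illustration. Each node stands for one document region and carries concatenated text, visual, and layout features; the weights are random, so the output only demonstrates the data flow.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard graph convolution layer."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def classify_document(adj, node_features, w1, w2, w_out):
    """Two graph convolutions, mean pooling over regions, softmax over classes."""
    a_norm = normalize_adjacency(adj)
    h = gcn_layer(a_norm, node_features, w1)
    h = gcn_layer(a_norm, h, w2)
    doc_embedding = h.mean(axis=0)              # one vector per document
    logits = doc_embedding @ w_out
    exp = np.exp(logits - logits.max())         # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 regions, 8-d text, 8-d visual, 4-d layout features.
num_regions, text_dim, visual_dim, layout_dim = 4, 8, 8, 4
hidden_dim, num_classes = 16, 3

text = rng.normal(size=(num_regions, text_dim))      # e.g. a text embedding per region
visual = rng.normal(size=(num_regions, visual_dim))  # e.g. an image-crop embedding per region
layout = rng.uniform(size=(num_regions, layout_dim)) # e.g. a normalized bounding box
node_features = np.concatenate([text, visual, layout], axis=1)

# Assumed adjacency: each region is linked to the next one in reading order.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Untrained random weights, used only to show the shapes involved.
w1 = rng.normal(scale=0.1, size=(node_features.shape[1], hidden_dim))
w2 = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
w_out = rng.normal(scale=0.1, size=(hidden_dim, num_classes))

probs = classify_document(adj, node_features, w1, w2, w_out)
print(probs)
```

In this sketch the graph has only a handful of nodes per page, which hints at why such a model can be much cheaper to train and run than a large pre-trained transformer. How the paper actually builds edges and extracts region features is not stated in the summary above.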
Taxonomy
Topics: Handwritten Text Recognition Techniques · Topic Modeling · Text and Document Classification Technologies
Methods: Graph Convolutional Networks · Convolution
