A Survey of Deep Learning Approaches for OCR and Document Understanding
Nishant Subramani, Alexandre Matton, Malcolm Greaves, Adrian Lam

TL;DR
This survey reviews deep learning methods for automatic document understanding, covering NLP and computer vision techniques applied to English documents like invoices and resumes, highlighting recent progress and research directions.
Contribution
It consolidates the deep learning methodologies for document understanding found in the literature, giving researchers a starting point for exploring the area.
Findings
Progress in deep learning for natural language processing and computer vision is increasingly built into contemporary document understanding systems.
Different techniques suit different document types, such as invoices, contracts, and resumes; the survey restricts itself to documents written in English.
The survey identifies key challenges and future research directions.
Abstract
Documents are a core part of many businesses in many fields such as law, finance, and technology among others. Automatic understanding of documents such as invoices, contracts, and resumes is lucrative, opening up many new avenues of business. The fields of natural language processing and computer vision have seen tremendous progress through the development of deep learning such that these methods have started to become infused in contemporary document understanding systems. In this survey paper, we review different techniques for document understanding for documents written in English and consolidate methodologies present in literature to act as a jumping-off point for researchers exploring this area.
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction
