A Survey of Deep Learning Approaches for OCR and Document Understanding

Nishant Subramani; Alexandre Matton; Malcolm Greaves; Adrian Lam

arXiv:2011.13534 · cs.CL · February 8, 2021 · 26 cites

TL;DR

This survey reviews deep learning methods for automatic document understanding. It covers natural language processing and computer vision techniques applied to English documents such as invoices and resumes, and highlights recent progress and open research directions.

Contribution

The survey consolidates existing deep learning methodologies for document understanding, giving researchers a starting point for further work in the area.

Findings

01 Deep learning significantly advances document understanding capabilities.

02 Different techniques suit different document types, such as invoices, contracts, and resumes.

03 The survey identifies key challenges and future research directions.

Abstract

Documents are a core part of many businesses in fields such as law, finance, and technology. Automatic understanding of documents such as invoices, contracts, and resumes is lucrative and opens up many new avenues of business. Natural language processing and computer vision have seen tremendous progress through the development of deep learning, and these methods are now being built into contemporary document understanding systems. In this survey paper, we review techniques for understanding documents written in English and consolidate the methodologies present in the literature to act as a jumping-off point for researchers exploring this area.

Code & Models

Repositories

RajArPatra/Super-OCR
pytorch

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction