Deep Reader: Information extraction from Document images via relation extraction and Natural Language

Vishwanath D, Rohit Rahul, Gunjan Sehgal, Swati, Arindam Chowdhury, Monika Sharma, Lovekesh Vig, Gautam Shroff, and Ashwin Srinivasan

arXiv:1812.04377·cs.CV·December 17, 2018·1 cites

Open Access

TL;DR

DeepReader is an end-to-end framework that combines advanced vision algorithms and relational modeling to extract structured information from document images, enabling natural language querying for non-technical users.

Contribution

It introduces a novel integrated system that captures visual entities and their relationships in documents, facilitating easy information retrieval through natural language queries.

Findings

01 Effective recognition of handwritten and printed text.

02 Successful extraction of visual entities like tables and boxes.

03 Natural language interface enables non-technical user queries.

Abstract

Recent advancements in the area of Computer Vision with state-of-the-art Neural Networks have given a boost to Optical Character Recognition (OCR) accuracies. However, extracting characters/text alone is often insufficient for relevant information extraction, as documents also have a visual structure that is not captured by OCR. Extracting information from tables, charts, footnotes, boxes, and headings, and retrieving the corresponding structured representation for the document, remains a challenge and finds application in a large number of real-world use cases. In this paper, we propose a novel enterprise-based end-to-end framework called DeepReader, which facilitates information extraction from document images by identifying visual entities and populating a meta relational model across different entities in the document image. The model schema allows for an easy-to-understand abstraction of…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction