Deep Reader: Information extraction from Document images via relation extraction and Natural Language
Vishwanath D, Rohit Rahul, Gunjan Sehgal, Swati, Arindam Chowdhury, Monika Sharma, Lovekesh Vig, Gautam Shroff, and Ashwin Srinivasan

TL;DR
DeepReader is an end-to-end framework that combines advanced vision algorithms and relational modeling to extract structured information from document images, enabling natural language querying for non-technical users.
Contribution
It introduces a novel integrated system that captures visual entities and their relationships in documents, facilitating easy information retrieval through natural language queries.
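As a rough illustration of what "visual entities and their relationships" could look like once stored relationally, here is a minimal sketch using Python's built-in `sqlite3`. The table names (`entity`, `relation`), the columns, the `right_of` relation, the helper `value_right_of`, and the sample rows are all hypothetical and chosen for illustration; they are not taken from the paper. The step that turns a natural language question into such a query is not shown.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the schema used by DeepReader.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- e.g. 'word', 'box', 'table'
    text TEXT,
    x    INTEGER,         -- top-left corner, in pixels
    y    INTEGER
);
CREATE TABLE relation (
    src  INTEGER REFERENCES entity(id),
    dst  INTEGER REFERENCES entity(id),
    kind TEXT NOT NULL    -- e.g. 'right_of' means src is to the right of dst
);
""")

# Toy rows standing in for the output of a vision pipeline (made up).
conn.executemany(
    "INSERT INTO entity (id, kind, text, x, y) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "word", "Invoice No:", 10, 10),
        (2, "word", "INV-001", 120, 10),
        (3, "word", "Date:", 10, 40),
        (4, "word", "2019-01-01", 120, 40),
    ],
)
conn.executemany(
    "INSERT INTO relation (src, dst, kind) VALUES (?, ?, ?)",
    [(2, 1, "right_of"), (4, 3, "right_of")],
)


def value_right_of(conn, label):
    """Return the text of the entity stored as 'right_of' the given label, or None."""
    row = conn.execute(
        """
        SELECT v.text
        FROM entity AS l
        JOIN relation AS r ON r.dst = l.id AND r.kind = 'right_of'
        JOIN entity AS v ON v.id = r.src
        WHERE l.text = ?
        """,
        (label,),
    ).fetchone()
    return row[0] if row else None


invoice_number = value_right_of(conn, "Invoice No:")
```

A question such as "What is the invoice number?" would then reduce to a lookup like `value_right_of(conn, "Invoice No:")`, which is the kind of query a non-technical user would never need to write by hand.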
Findings
Effective recognition of handwritten and printed text.
Successful extraction of visual entities like tables and boxes.
Natural language interface enables non-technical user queries.
Abstract
Recent advancements in the area of Computer Vision with state-of-the-art Neural Networks have given a boost to Optical Character Recognition (OCR) accuracies. However, extracting characters/text alone is often insufficient for relevant information extraction, as documents also have a visual structure that is not captured by OCR. Extracting information from tables, charts, footnotes, boxes, and headings, and retrieving the corresponding structured representation for the document, remains a challenge and finds application in a large number of real-world use cases. In this paper, we propose a novel enterprise-based end-to-end framework called DeepReader, which facilitates information extraction from document images via identification of visual entities and populating a meta relational model across different entities in the document image. The model schema allows for an easy-to-understand abstraction of…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction
