FabricQA-Extractor: A Question Answering System to Extract Information from Documents using Natural Language Questions
Qiming Wang, Raul Castro Fernandez

TL;DR
FabricQA-Extractor is a question answering system that uses knowledge of the target table's relational structure to improve large-scale information extraction from unstructured documents.
Contribution
The paper introduces Relation Coherence, a new model that exploits the relational structure of the target extraction table to improve extraction quality, and incorporates it into FabricQA-Extractor, an end-to-end system for large-scale extraction.
Findings
Relation Coherence improves extraction accuracy.
FabricQA-Extractor performs well on large-scale datasets.
The system efficiently handles millions of documents.
Abstract
Reading comprehension models answer questions posed in natural language when provided with a short passage of text. They present an opportunity to address a long-standing challenge in data management: the extraction of structured data from unstructured text. Consequently, several approaches use these models to perform information extraction. However, these modern approaches miss an opportunity because they do not exploit the relational structure of the target extraction table. In this paper, we introduce a new model, Relation Coherence, that exploits knowledge of the relational structure to improve the extraction quality. We incorporate the Relation Coherence model as part of FabricQA-Extractor, an end-to-end system we built from scratch to conduct large-scale extraction tasks over millions of documents. We demonstrate on two datasets with millions of passages that…
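To make the task framing concrete, the following is a minimal sketch of how a partially filled table might be completed by posing natural language questions to a reading comprehension model. It is an illustration under stated assumptions, not the paper's implementation: the question templates, the `fill_table` helper, and the `answer_fn` interface are all hypothetical, and the sketch does not implement Relation Coherence. It only shows the baseline idea of turning each (subject, relation) cell into a question and keeping the highest-scoring answer across passages.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical question templates, one per target relation (table column).
TEMPLATES: Dict[str, str] = {
    "place_of_birth": "Where was {subject} born?",
    "employer": "Who does {subject} work for?",
}

# Assumed reader interface: (question, passage) -> (answer, score), or None
# when the passage contains no answer. Any reading comprehension model could
# be wrapped to match this signature.
AnswerFn = Callable[[str, str], Optional[Tuple[str, float]]]


def build_question(relation: str, subject: str) -> str:
    """Turn one (subject, relation) cell into a natural language question."""
    return TEMPLATES[relation].format(subject=subject)


def fill_table(
    rows: List[Dict[str, Optional[str]]],
    relation: str,
    passages: List[str],
    answer_fn: AnswerFn,
) -> List[Dict[str, Optional[str]]]:
    """Fill the `relation` column of each row with the best-scoring answer."""
    filled = []
    for row in rows:
        question = build_question(relation, row["subject"])
        best: Optional[Tuple[float, str]] = None
        for passage in passages:
            result = answer_fn(question, passage)
            if result is None:
                continue  # the reader found no answer in this passage
            answer, score = result
            if best is None or score > best[0]:
                best = (score, answer)
        # Leave the cell empty (None) when no passage yields an answer.
        filled.append({**row, relation: best[1] if best else None})
    return filled
```

In this sketch each answer is scored in isolation, which is the gap the abstract points at: a model that knows the relational structure of the target table could use it when ranking candidate answers, rather than relying on the reader's score alone.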
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
