CQE: A Comprehensive Quantity Extractor
Satya Almasian, Vivian Kazakova, Philip Göldner, Michael Gertz

TL;DR
This paper introduces CQE, a comprehensive framework for extracting, normalizing, and understanding quantities in text, including their values, units, behaviors, and associated concepts, outperforming existing methods.
Contribution
The paper presents a novel, open-source quantity extraction framework that detects complex quantity information and associated concepts using dependency parsing and a new dataset.
Findings
Outperforms existing quantity extraction systems
First to detect concepts associated with quantities
Effective normalization and standardization of quantities
Abstract
Quantities are essential in documents to describe factual information. They are ubiquitous in application domains such as finance, business, medicine, and science in general. Compared to other information extraction approaches, interestingly only a few works exist that describe methods for a proper extraction and representation of quantities in text. In this paper, we present such a comprehensive quantity extraction framework from text data. It efficiently detects combinations of values and units, the behavior of a quantity (e.g., rising or falling), and the concept a quantity is associated with. Our framework makes use of dependency parsing and a dictionary of units, and it provides for a proper normalization and standardization of detected quantities. Using a novel dataset for evaluation, we show that our open source framework outperforms other systems and -- to the best of our…
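To make the idea of pairing a value with a normalized unit concrete, here is a minimal sketch. It is not the CQE implementation or its API: it swaps the dependency parsing described in the abstract for a simple regular expression, and the names `UNIT_DICT`, `SCALES`, `Quantity`, and `extract` are hypothetical, as is the tiny unit dictionary and the "-" placeholder for an unknown unit. It covers only values and units, not quantity behavior or associated concepts.

```python
import re
from dataclasses import dataclass

# Hypothetical unit dictionary: surface form -> normalized unit name.
UNIT_DICT = {
    "km": "kilometre", "kilometers": "kilometre", "kilometres": "kilometre",
    "$": "dollar", "usd": "dollar", "dollars": "dollar",
    "%": "percentage", "percent": "percentage",
}

# Scale words that multiply the numeric value.
SCALES = {"thousand": 1e3, "million": 1e6, "billion": 1e9}


@dataclass
class Quantity:
    value: float  # standardized numeric value
    unit: str     # normalized unit name, or "-" if none was recognized


# Regex stand-in for a parser: optional "$", a number,
# an optional scale word, and an optional trailing unit token.
PATTERN = re.compile(
    r"(\$)?(\d+(?:\.\d+)?)"
    r"(?:\s+(thousand|million|billion)\b)?"
    r"(?:\s*(%|[A-Za-z]+))?",
    re.IGNORECASE,
)


def extract(text: str) -> list[Quantity]:
    """Return value/unit pairs found in text, with units normalized."""
    quantities = []
    for prefix, number, scale, suffix in PATTERN.findall(text):
        value = float(number) * SCALES.get(scale.lower(), 1.0)
        # A currency prefix takes priority over whatever word follows.
        surface = prefix or suffix.lower()
        unit = UNIT_DICT.get(surface, "-")
        quantities.append(Quantity(value, unit))
    return quantities


if __name__ == "__main__":
    for sentence in (
        "Revenue rose to $3.5 billion",
        "The trail is 12 km long",
        "Shares fell 5 percent",
    ):
        print(sentence, "->", extract(sentence))
```

A dictionary lookup keeps the normalization step separate from detection, so new surface forms can be added without touching the matching logic. A pattern this simple will miss many constructions that a parser-based approach is meant to handle.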
Taxonomy
Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Time Series Analysis and Forecasting
