Towards Efficient Quantity Retrieval from Text: An Approach via Description Parsing and Weak Supervision
Yixuan Cao, Zhengrong Chen, Chengxuan Xia, Kun Wu, Ping Luo

TL;DR
This paper presents a framework for retrieving quantitative facts from unstructured text: it parses text into structured (description, quantity) pairs and trains with weak supervision, raising top-1 retrieval accuracy from 30.98% to 64.66% on financial annual reports.
Contribution
The paper introduces a description parsing approach combined with weak supervision to improve quantity retrieval from unstructured text, especially in financial reports.
Findings
Top-1 retrieval accuracy improved from 30.98% to 64.66%.
Constructed a large paraphrase dataset using weak supervision.
Retrieves long-tail quantitative facts, with supporting evidence, from unstructured documents.
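The abstract states that the paraphrase dataset is built with weak supervision based on quantity co-occurrence. The sketch below shows one way such a signal could be used, assuming that two different descriptions attached to the same quantity value are candidate paraphrases. The function name `paraphrase_candidates`, the sample descriptions, and the exact-match grouping rule are all hypothetical illustrations, not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations


def paraphrase_candidates(pairs):
    """Group descriptions by shared quantity value and emit candidate pairs.

    Assumption: descriptions that co-occur with the same value are likely to
    describe the same fact. This is a noisy, weak label, not a gold label.
    """
    by_value = defaultdict(list)
    for description, value in pairs:
        # Skip exact duplicate descriptions under the same value.
        if description not in by_value[value]:
            by_value[value].append(description)

    candidates = []
    for value, descriptions in by_value.items():
        # Every unordered pair of distinct descriptions is one candidate.
        for a, b in combinations(descriptions, 2):
            candidates.append((a, b, value))
    return candidates


# Hypothetical (description, quantity) pairs, invented for illustration.
pairs = [
    ("net profit attributable to shareholders", "1,234.56"),
    ("net income attributable to owners of the parent", "1,234.56"),
    ("total operating revenue", "9,876.54"),
]
candidates = paraphrase_candidates(pairs)
for a, b, value in candidates:
    print(f"{value}: {a!r} <-> {b!r}")
```

Exact value matching produces false positives when unrelated facts happen to share a number, so a real system would likely need extra filtering; the source does not say how this is handled.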
Abstract
Quantitative facts are continually generated by companies and governments, supporting data-driven decision-making. While common facts are structured, many long-tail quantitative facts remain buried in unstructured documents, making them difficult to access. We propose the task of Quantity Retrieval: given a description of a quantitative fact, the system returns the relevant value and supporting evidence. Understanding quantity semantics in context is essential for this task. We introduce a framework based on description parsing that converts text into structured (description, quantity) pairs for effective retrieval. To improve learning, we construct a large paraphrase dataset using weak supervision based on quantity co-occurrence. We evaluate our approach on a large corpus of financial annual reports and a newly annotated quantity description dataset. Our method significantly improves…
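To make the task concrete, here is a minimal sketch of the input/output shape the abstract describes: text is converted into (description, quantity) pairs, and a query description returns a value plus its supporting sentence. The regex-based parser and the Jaccard token-overlap scorer are stand-in assumptions chosen for brevity; the paper's actual parser and retrieval model are not specified in this summary. All names (`parse_descriptions`, `retrieve`) and the sample sentence are hypothetical.

```python
import re

# Toy pattern for numbers such as 12.5, 1,234.56 or 30%.
QUANTITY = re.compile(r"\d[\d,]*(?:\.\d+)?%?")


def parse_descriptions(sentence):
    """Toy parser: pair each number with the text that precedes it."""
    pairs = []
    start = 0
    for match in QUANTITY.finditer(sentence):
        description = sentence[start:match.start()].strip(" ,;:.")
        if description:
            pairs.append((description, match.group()))
        start = match.end()
    return pairs


def tokens(text):
    """Lowercased alphabetic tokens, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, index):
    """Return the index entry whose description best overlaps the query."""
    q = tokens(query)

    def score(entry):
        d = tokens(entry[0])
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return max(index, key=score)


# Hypothetical corpus sentence, invented for illustration.
corpus = [
    "Revenue from cloud services was 12.5 million yuan, "
    "and research expenses were 3.2 million yuan.",
]
# Each entry keeps the source sentence as supporting evidence.
index = [(d, q, s) for s in corpus for d, q in parse_descriptions(s)]

description, quantity, evidence = retrieve("research expenses", index)
print(quantity, "|", evidence)
```

Keeping the source sentence alongside each pair is what lets the system return evidence together with the value, as the task definition requires.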
Taxonomy
Topics: Advanced Text Analysis Techniques · Stock Market Forecasting Methods · Topic Modeling
