Part & Whole Extraction: Towards A Deep Understanding of Quantitative Facts for Percentages in Text
Lei Fang, Jian-Guang Lou

TL;DR
This paper presents a novel sequence tagging approach with a skip mechanism to extract quantitative part-whole facts related to percentages in text, enhancing understanding for applications like infographic generation.
Contribution
It introduces a skip-based sequence tagging model specifically designed for extracting part and whole entities associated with percentages, improving accuracy over existing methods.
Findings
Effective in extracting part-whole relationships for percentages
Improved performance on both the part/whole extraction task and the CoNLL-2003 named entity recognition task
Skip mechanism enhances extraction accuracy
Abstract
We study the problem of quantitative fact extraction for text with percentages. For example, given the sentence "30 percent of Americans like watching football, while 20% prefer to watch NBA.", our goal is to obtain a deep understanding of the percentage numbers ("30 percent" and "20%") by extracting their quantitative facts: part ("like watching football" and "prefer to watch NBA") and whole ("Americans"). These quantitative facts can empower new applications like automated infographic generation. We formulate part and whole extraction as a sequence tagging problem. Due to the large gap between a part or whole and its corresponding percentage, we introduce a skip mechanism in sequence modeling and achieve improved performance on both our task and the CoNLL-2003 named entity recognition task. Experimental results demonstrate that learning to skip in sequence tagging is promising.
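To make the sequence tagging formulation concrete, the example sentence from the abstract can be written as one tag per token. This is a minimal sketch, not the authors' implementation: the BIO scheme, the label names `PCT`, `PART`, and `WHOLE`, the tokenization, and the helper `decode_spans` are all assumptions made for illustration. The skip mechanism itself is not modeled here.

```python
# Hypothetical BIO encoding of the abstract's example sentence.
# Label names (PCT, PART, WHOLE) and tokenization are illustrative assumptions.
tokens = ["30", "percent", "of", "Americans", "like", "watching", "football",
          ",", "while", "20%", "prefer", "to", "watch", "NBA", "."]
tags = ["B-PCT", "I-PCT", "O", "B-WHOLE", "B-PART", "I-PART", "I-PART",
        "O", "O", "B-PCT", "B-PART", "I-PART", "I-PART", "I-PART", "O"]


def decode_spans(tokens, tags):
    """Collapse per-token BIO tags into (label, text) spans."""
    spans, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        # An I- tag with the same label extends the currently open span.
        if prefix == "I" and name == label:
            words.append(token)
            continue
        # Anything else closes the open span, if there is one.
        if label is not None:
            spans.append((label, " ".join(words)))
        # B- (or a stray I-) opens a new span; O leaves no span open.
        label, words = (name, [token]) if prefix in ("B", "I") else (None, [])
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


spans = decode_spans(tokens, tags)
for span_label, text in spans:
    print(f"{span_label}: {text}")
```

In this encoding the whole ("Americans") is tagged once even though it applies to both percentages, and the second part sits several tokens away from "20%". Linking each percentage to its part and whole across such gaps is the difficulty the abstract attributes to the large gap between a part or whole and its percentage.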
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
