Enriching Knowledge Bases with Counting Quantifiers
Paramita Mirza, Simon Razniewski, Fariz Darari, Gerhard Weikum

TL;DR
This paper introduces CINEX, a system for extracting counting information from text to enhance knowledge bases, addressing challenges like incomplete data and linguistic diversity, and demonstrating significant enrichment of Wikidata.
Contribution
The paper presents the first comprehensive system for extracting counting quantifiers from text, improving knowledge base enrichment through novel techniques and large-scale experiments.
Findings
Achieves 60% average precision in extracting counting information.
Enriches Wikidata with 2.5 million new facts for 110 relations.
Demonstrates the effectiveness of the approach in real-world knowledge base expansion.
Abstract
Information extraction traditionally focuses on extracting relations between identifiable entities, such as <Monterey, locatedIn, California>. Yet, texts often also contain counting information, stating that a subject is in a specific relation with a number of objects, without mentioning the objects themselves, for example, "California is divided into 58 counties". Such counting quantifiers can help in a variety of tasks such as query answering or knowledge base curation, but are neglected by prior work. This paper develops the first full-fledged system for extracting counting information from text, called CINEX. We employ distant supervision using fact counts from a knowledge base as training seeds, and develop novel techniques for dealing with several challenges: (i) non-maximal training seeds due to the incompleteness of knowledge bases, (ii) sparse and skewed observations in text…
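The distant-supervision idea in the abstract can be illustrated with a minimal sketch. This is not the CINEX implementation: the toy knowledge base, the relation name `containsAdministrativeEntity`, the function `label_tokens`, and the exact-match labeling rule are all assumptions made for illustration. The sketch only shows how a fact count from a knowledge base might serve as a training seed for tagging numbers in text.

```python
import re

# Hypothetical toy knowledge base: (subject, relation) -> set of known objects.
# The relation name and the placeholder object names are illustrative only.
TOY_KB = {
    ("California", "containsAdministrativeEntity"): {
        f"county_{i}" for i in range(58)
    },
}

NUMBER = re.compile(r"\d+")


def label_tokens(sentence, subject, relation, kb=TOY_KB):
    """Tag a token COUNT if it is an integer equal to the KB fact count, else O.

    Exact matching is a simplifying assumption; it ignores the case where
    the knowledge base is incomplete and the true count is higher.
    """
    seed = len(kb.get((subject, relation), ()))
    labels = []
    for token in sentence.split():
        core = token.strip('.,;:"()')  # drop surrounding punctuation
        is_seed = bool(NUMBER.fullmatch(core)) and int(core) == seed
        labels.append((token, "COUNT" if is_seed else "O"))
    return labels


labeled = label_tokens(
    "California is divided into 58 counties.",
    "California",
    "containsAdministrativeEntity",
)
```

Token-level labels of this kind could then be used to train a sequence tagger. Because a seed taken from an incomplete knowledge base may undercount, exact matching would miss mentions of the true, larger number, which is the "non-maximal training seeds" challenge the abstract names.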
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
