Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus
Su Nam Kim, Preslav Nakov

TL;DR
This paper presents a bootstrapping method that uses web data to automatically interpret noun compounds with fine-grained semantic relations, supporting the creation of semantic lexical resources for NLP.
Contribution
It introduces a novel bootstrapping approach that jointly extracts noun compounds and their semantic interpretations using web statistics, improving accuracy and coverage.
Findings
Keeping one noun of the compound fixed yields a higher number of semantically interpreted NCs
Keeping one noun fixed also improves accuracy, owing to stronger semantic restrictions
Effective use of web data for semantic extraction
Abstract
Responding to the need for semantic lexical resources in natural language processing applications, we examine methods to acquire noun compounds (NCs), e.g., "orange juice", together with suitable fine-grained semantic interpretations, e.g., "squeezed from", which are directly usable as paraphrases. We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations. In evaluation, we found that having one compound noun fixed yields both a higher number of semantically interpreted NCs and improved accuracy due to stronger semantic restrictions.
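The alternating extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small in-memory list of triples (`TOY_CORPUS`) stands in for web statistics, and the function name `bootstrap`, the `min_count` threshold, and the `fixed_head` option (one possible reading of "having one compound noun fixed") are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stand-in for web search hits. Each triple (head, pattern, modifier)
# reads as "<head> that is <pattern> <modifier>", e.g. "juice squeezed from orange",
# and corresponds to the noun compound "<modifier> <head>".
TOY_CORPUS = [
    ("juice", "squeezed from", "orange"),
    ("juice", "squeezed from", "lemon"),
    ("juice", "made from", "lemon"),
    ("juice", "made from", "apple"),
    ("juice", "made from", "orange"),
    ("pie", "made from", "apple"),
    ("oil", "extracted from", "olive"),
]


def bootstrap(seed_ncs, corpus, iterations=2, min_count=1, fixed_head=None):
    """Alternately grow a set of paraphrasing patterns and a set of NCs.

    seed_ncs   -- set of (modifier, head) pairs to start from
    fixed_head -- if given, only NCs with this head noun are accepted
                  (illustrative assumption about fixing one noun)
    """
    ncs = set(seed_ncs)
    patterns = set()
    for _ in range(iterations):
        # Step 1: collect patterns that connect the nouns of already known NCs.
        pattern_counts = Counter(
            pattern for head, pattern, mod in corpus if (mod, head) in ncs
        )
        patterns |= {p for p, c in pattern_counts.items() if c >= min_count}

        # Step 2: collect new NCs whose nouns are connected by a known pattern.
        nc_counts = Counter(
            (mod, head)
            for head, pattern, mod in corpus
            if pattern in patterns and (fixed_head is None or head == fixed_head)
        )
        ncs |= {nc for nc, c in nc_counts.items() if c >= min_count}
    return ncs, patterns


if __name__ == "__main__":
    found_ncs, found_patterns = bootstrap(
        {("orange", "juice")}, TOY_CORPUS, fixed_head="juice"
    )
    print(sorted(found_ncs))
    print(sorted(found_patterns))
```

On this toy data, fixing the head noun to "juice" keeps the extracted NCs within one semantic neighborhood, whereas leaving it unset also admits "apple pie" through the generic pattern "made from". This mirrors, in miniature, the stronger semantic restriction the abstract attributes to fixing one noun.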
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
