Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

Su Nam Kim; Preslav Nakov

arXiv:1911.12085 · cs.CL · November 28, 2019 · 33 citations

Open Access

TL;DR

This paper presents a bootstrapping method that uses web statistics to acquire noun compounds together with fine-grained semantic interpretations that are directly usable as paraphrases, supporting the creation of semantic lexical resources for NLP.

Contribution

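The paper introduces a bootstrapping approach that uses web statistics to jointly extract noun compounds and their paraphrasing patterns in multiple alternating iterations. Keeping one noun of the compound fixed increases the number of interpreted compounds and improves accuracy.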

Findings

01 Keeping one noun of the compound fixed yields a higher number of semantically interpreted NCs

02 Accuracy improves because of the stronger semantic restrictions

03 Web data is used effectively for semantic extraction

Abstract

Responding to the need for semantic lexical resources in natural language processing applications, we examine methods to acquire noun compounds (NCs), e.g., "orange juice", together with suitable fine-grained semantic interpretations, e.g., "squeezed from", which are directly usable as paraphrases. We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations. In evaluation, we found that having one compound noun fixed yields both a higher number of semantically interpreted NCs and improved accuracy due to stronger semantic restrictions.
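
The alternating extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory `CORPUS` stands in for web search results, the "head that is PATTERN modifier" template is an assumed simplification, and the function names `extract_patterns`, `extract_ncs`, and `bootstrap` are hypothetical. The head noun is kept fixed, mirroring the setting the abstract reports as most effective.

```python
import re
from collections import Counter

# Toy stand-in for web snippets (assumption: the paper queries the web instead).
CORPUS = [
    "juice that is squeezed from orange",
    "juice that is squeezed from lemon",
    "juice that is made from lemon",
    "juice that is made from apple",
    "pie that is made from apple",
]


def extract_patterns(ncs, corpus):
    """For each (modifier, head) pair, count the phrases that link head to modifier."""
    patterns = Counter()
    for modifier, head in ncs:
        regex = re.compile(
            rf"^{re.escape(head)} that is (.+) {re.escape(modifier)}$"
        )
        for sentence in corpus:
            match = regex.match(sentence)
            if match:
                patterns[match.group(1)] += 1
    return patterns


def extract_ncs(patterns, head, corpus):
    """With the head noun fixed, find modifiers that occur with any known pattern."""
    found = set()
    for pattern in patterns:
        regex = re.compile(
            rf"^{re.escape(head)} that is {re.escape(pattern)} (\w+)$"
        )
        for sentence in corpus:
            match = regex.match(sentence)
            if match:
                found.add((match.group(1), head))
    return found


def bootstrap(seed_ncs, head, corpus, iterations=3):
    """Alternate between pattern extraction and NC extraction until nothing new appears."""
    ncs = set(seed_ncs)
    patterns = Counter()
    for _ in range(iterations):
        patterns = extract_patterns(ncs, corpus)
        new_ncs = extract_ncs(patterns, head, corpus)
        if new_ncs <= ncs:  # no new compounds were found, so stop early
            break
        ncs |= new_ncs
    return ncs, patterns


if __name__ == "__main__":
    ncs, patterns = bootstrap({("orange", "juice")}, "juice", CORPUS)
    print(sorted(ncs))
    print(patterns.most_common())
```

Starting from the single seed "orange juice", the sketch first learns "squeezed from", uses it to reach "lemon juice", and from there picks up "made from" and "apple juice". Because the head is fixed to "juice", the sentence about "pie" never contributes a compound, which is a small-scale picture of the semantic restriction the abstract refers to.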

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems