Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
Stephen Roller, Katrin Erk

TL;DR
This paper analyzes how an existing distributional vector model predicts lexical entailment, finds that it learns to detect Hearst patterns (e.g., "X such as Y"), and introduces a new model that exploits this behavior for feature extraction.
Contribution
The paper provides a qualitative analysis of one existing model, previously shown to measure only the prototypicality of word pairs, and proposes a feature extraction method for lexical entailment prediction based on an iterative procedure similar to Principal Component Analysis.
Findings
The existing model under analysis strongly learns to identify hypernyms using Hearst patterns.
The proposed model matches or outperforms prior work on multiple data sets.
The proposed model combines the features extracted from this behavior with the strengths of other models in the literature.
Abstract
We consider the task of predicting lexical entailment using distributional vectors. We perform a novel qualitative analysis of one existing model which was previously shown to only measure the prototypicality of word pairs. We find that the model strongly learns to identify hypernyms using Hearst patterns, which are well known to be predictive of lexical relations. We present a novel model which exploits this behavior as a method of feature extraction in an iterative procedure similar to Principal Component Analysis. Our model combines the extracted features with the strengths of other proposed models in the literature, and matches or outperforms prior work on multiple data sets.
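The abstract describes the feature extraction only as "an iterative procedure similar to Principal Component Analysis". The sketch below is one plausible reading of that sentence, not the authors' implementation. It assumes that each round fits a linear classifier on the pair vectors, keeps the unit normal of its decision boundary as a feature detector, removes that direction from the data by vector rejection, and repeats. The function names (`fit_direction`, `reject`, `extract_directions`), the choice of logistic regression trained by gradient descent, and all hyperparameters are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def fit_direction(rows, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression separator (assumed classifier) by batch
    gradient descent and return the unit normal of its decision boundary."""
    dim = len(rows[0])
    n = len(rows)
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y in zip(rows, labels):
            # Clamp the logit so math.exp cannot overflow.
            z = max(-30.0, min(30.0, dot(w, x)))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j in range(dim):
                grad[j] += err * x[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    norm = math.sqrt(dot(w, w))
    return [wj / norm for wj in w] if norm > 0 else w


def reject(rows, p):
    """Remove from every row its component along the unit vector p."""
    out = []
    for x in rows:
        c = dot(x, p)
        out.append([xj - c * pj for xj, pj in zip(x, p)])
    return out


def extract_directions(rows, labels, n_iter=3):
    """Repeatedly fit a separator, store its normal, and project it out."""
    directions = []
    residual = [list(x) for x in rows]
    for _ in range(n_iter):
        p = fit_direction(residual, labels)
        directions.append(p)
        residual = reject(residual, p)
    return directions
```

Because each new direction is fitted on data from which the earlier directions have been removed, the extracted directions are mutually orthogonal up to rounding error, which is the sense in which this sketch resembles Principal Component Analysis. Projecting a word-pair vector onto each direction would then give one feature per round.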
