More than just Frequency? Demasking Unsupervised Hypernymy Prediction Methods

Thomas Bott; Dominik Schlechtweg; Sabine Schulte im Walde

arXiv:2106.00055·cs.CL·June 2, 2021

TL;DR

This study compares unsupervised hypernymy prediction methods and finds that most are heavily influenced by word frequency, while the second-order method SLQS is less accurate overall but makes correct predictions where the others go wrong.

Contribution

The paper shows how strongly existing unsupervised hypernymy prediction methods are biased toward word frequency, and argues that such models should be checked for frequency effects.

Findings

01

The predictions of WeedsPrec, invCL, and SLQS Row strongly overlap and are highly correlated with frequency-based predictions, across English and German datasets.

02

The second-order method SLQS has lower overall accuracy but makes correct predictions where the other methods fail.

03

Computational methods should be checked for frequency bias in order to separate frequency-related from frequency-unrelated effects.

Abstract

This paper presents a comparison of unsupervised methods of hypernymy prediction (i.e., to predict which word in a pair of words such as fish-cod is the hypernym and which the hyponym). Most importantly, we demonstrate across datasets for English and for German that the predictions of three methods (WeedsPrec, invCL, SLQS Row) strongly overlap and are highly correlated with frequency-based predictions. In contrast, the second-order method SLQS shows an overall lower accuracy but makes correct predictions where the others go wrong. Our study once more confirms the general need to check the frequency bias of a computational method in order to identify frequency-(un)related effects.
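
To make the comparison in the abstract concrete, the following is a minimal sketch of two direction predictors for a word pair: a frequency baseline (the more frequent word is taken as the hypernym) and WeedsPrec in its commonly cited form (the share of a word's context weight that falls on contexts also seen with the other word). It also computes how often the two predictors agree. The function names, the toy vectors, and the toy frequency counts are all invented for illustration; the weighting scheme, corpora, and tie handling used in the paper are not given in this summary and are assumptions here.

```python
from typing import Dict, List, Tuple

# Sparse context vector: context feature -> non-negative weight (e.g. PPMI).
Vector = Dict[str, float]


def weeds_prec(u: Vector, v: Vector) -> float:
    """Share of u's total feature weight on contexts that also occur with v."""
    total = sum(u.values())
    if total == 0:
        return 0.0
    shared = sum(w for f, w in u.items() if v.get(f, 0.0) > 0)
    return shared / total


def predict_by_weeds_prec(pair: Tuple[str, str], vectors: Dict[str, Vector]) -> str:
    """Return the predicted hypernym of the pair.

    The word whose contexts are more fully included in the other's is
    treated as the hyponym, so the other word is returned.
    Ties fall back to the first word (an arbitrary choice for this sketch).
    """
    a, b = pair
    if weeds_prec(vectors[a], vectors[b]) > weeds_prec(vectors[b], vectors[a]):
        return b
    return a


def predict_by_frequency(pair: Tuple[str, str], freq: Dict[str, int]) -> str:
    """Frequency baseline: the more frequent word is predicted as hypernym."""
    a, b = pair
    return a if freq[a] >= freq[b] else b


def agreement(p: List[str], q: List[str]) -> float:
    """Fraction of pairs on which two predictors give the same answer."""
    return sum(x == y for x, y in zip(p, q)) / len(p)


# Toy data, invented for illustration only.
vectors = {
    "fish": {"swim": 2.0, "eat": 1.5, "catch": 1.0, "river": 0.5},
    "cod": {"swim": 1.0, "catch": 2.0, "fillet": 0.5},
    "animal": {"eat": 2.0, "live": 1.0, "swim": 0.5, "run": 1.0},
    "dog": {"eat": 1.0, "run": 1.5, "bark": 0.5},
}
freq = {"fish": 900, "cod": 40, "animal": 1500, "dog": 2000}
pairs = [("fish", "cod"), ("dog", "animal")]

wp_preds = [predict_by_weeds_prec(p, vectors) for p in pairs]
freq_preds = [predict_by_frequency(p, freq) for p in pairs]
overlap = agreement(wp_preds, freq_preds)

print(wp_preds, freq_preds, overlap)
```

On this toy data the two predictors agree on fish-cod and disagree on dog-animal, where the hyponym happens to be the more frequent word. Computing such an agreement rate over a full dataset is one simple way to check how much of a method's behaviour could be explained by frequency alone.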

Code & Models

Repositories

Thommy96/hyp-freq-comp
Official
