Homonymy Information for English WordNet
Rowan Hall Maudslay, Simone Teufel

TL;DR
This paper introduces a method for annotating homonymy in WordNet by linking it to the Oxford English Dictionary using Transformer-based definition embeddings. The best model reaches an F1 of .97 on an evaluation set annotated by the authors.
Contribution
It presents an approach that uses a language model to annotate homonymy in WordNet automatically by aligning WordNet definitions with Oxford English Dictionary definitions, in contrast to previous approaches, which treated the problem with clustering methods.
Findings
The best model attains an F1 of .97 on an evaluation set annotated by the authors.
Produced a high-quality homonymy annotation layer for WordNet.
Demonstrated the effectiveness of embedding-based dictionary alignment.
Abstract
A widely acknowledged shortcoming of WordNet is that it lacks a distinction between word meanings which are systematically related (polysemy), and those which are coincidental (homonymy). Several previous works have attempted to fill this gap, by inferring this information using computational methods. We revisit this task, and exploit recent advances in language modelling to synthesise homonymy annotation for Princeton WordNet. Previous approaches treat the problem using clustering methods; by contrast, our method works by linking WordNet to the Oxford English Dictionary, which contains the information we need. To perform this alignment, we pair definitions based on their proximity in an embedding space produced by a Transformer model. Despite the simplicity of this approach, our best model attains an F1 of .97 on an evaluation set that we annotate. The outcome of our work is a…
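The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for the Transformer encoder, the sense identifiers and definitions are hypothetical, and the sketch assumes that the dictionary places homonyms under separate entries, so that two WordNet senses count as homonymous when their nearest dictionary definitions belong to different entries.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a Transformer sentence encoder (assumption):
    represents a definition as a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(wordnet_defs, dictionary_senses):
    """Link each WordNet sense to the entry of its nearest dictionary definition."""
    links = {}
    for sense_id, definition in wordnet_defs.items():
        vec = embed(definition)
        best = max(dictionary_senses,
                   key=lambda s: cosine(vec, embed(s["definition"])))
        links[sense_id] = best["entry"]
    return links


def are_homonyms(links, sense_a, sense_b):
    """Two senses are treated as homonymous if they link to different entries."""
    return links[sense_a] != links[sense_b]


# Hypothetical toy data for the word "bank" (not taken from either resource).
wordnet_defs = {
    "bank.n.01": "sloping land beside a body of water",
    "bank.n.02": "a financial institution that accepts deposits",
    "bank.n.03": "a building in which the business of banking is transacted",
}
dictionary_senses = [
    {"entry": "bank_1",
     "definition": "the sloping land alongside a river or body of water"},
    {"entry": "bank_2",
     "definition": "a financial institution that accepts deposits and lends money"},
    {"entry": "bank_2",
     "definition": "the building in which banking business is transacted"},
]

links = align(wordnet_defs, dictionary_senses)
print(links)
print(are_homonyms(links, "bank.n.01", "bank.n.02"))
print(are_homonyms(links, "bank.n.02", "bank.n.03"))
```

In this toy setting the riverside sense and the financial sense link to different entries and are flagged as homonyms, while the institution and building senses share an entry and are treated as polysemous. Swapping `embed` for a real sentence encoder would leave the rest of the sketch unchanged.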
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Residual Connection · Label Smoothing · Adam
