The UNED systems at Senseval-2
David Fernandez-Amoros, Julio Gonzalo, Felisa Verdejo

TL;DR
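This paper presents the UNED systems at Senseval-2: an unsupervised word sense disambiguation system based on mutual information, plus a supervised extension for the lexical sample task. The unsupervised system ranked first among unsupervised entries in both English tasks, yet scored slightly below the first-sense heuristic on all words, which shows how hard unsupervised disambiguation remains.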
Contribution
The paper introduces an unsupervised word sense disambiguation system based on mutual information measured over a large corpus, adds a supervised extension, and evaluates both on the Senseval-2 English tasks (all words and lexical sample).
Findings
The unsupervised system scored 56.9% recall on the all-words task
The unsupervised system scored 40.2% recall on the lexical sample task
The system ranked first among unsupervised systems in both tasks
Abstract
We have participated in the SENSEVAL-2 English tasks (all words and lexical sample) with an unsupervised system based on mutual information measured over a large corpus (277 million words) and some additional heuristics. A supervised extension of the system was also presented to the lexical sample task. Our system scored first among unsupervised systems in both tasks: 56.9% recall in all words, 40.2% in lexical sample. This is slightly worse than the first sense heuristic for all words and 3.6% better for the lexical sample, a strong indication that unsupervised Word Sense Disambiguation remains a strong challenge.
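The abstract does not spell out how mutual information is computed or which additional heuristics are applied, so the following is only a minimal sketch of the general idea, not the authors' system. It assumes pointwise mutual information (PMI) estimated from sentence-level co-occurrence counts, and it assumes each sense comes with a few clue words. The function names, the clue-word dictionary, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count per-sentence word occurrences and unordered word-pair co-occurrences."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        unique = sorted(set(sentence))  # sorted so each pair has one canonical order
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts, len(sentences)


def pmi(a, b, word_counts, pair_counts, n):
    """PMI = log2(p(a, b) / (p(a) * p(b))); returns 0.0 if the pair never co-occurs."""
    joint = pair_counts.get(tuple(sorted((a, b))), 0)
    if joint == 0:
        return 0.0
    return math.log2(joint * n / (word_counts[a] * word_counts[b]))


def pick_sense(context, sense_clues, word_counts, pair_counts, n):
    """Return the sense whose (assumed) clue words have the highest total PMI with the context."""
    def score(clues):
        return sum(pmi(c, w, word_counts, pair_counts, n) for c in clues for w in context)
    return max(sense_clues, key=lambda sense: score(sense_clues[sense]))


# Toy corpus standing in for the large corpus mentioned in the abstract.
corpus = [
    ["bank", "money", "loan"],
    ["bank", "river", "water"],
    ["money", "loan", "interest"],
    ["river", "water", "fish"],
]
word_counts, pair_counts, n = cooccurrence_counts(corpus)

# Hypothetical clue words for two senses of "bank".
sense_clues = {"financial": ["money"], "riverside": ["water"]}
chosen = pick_sense(["loan", "interest"], sense_clues, word_counts, pair_counts, n)
```

With a context of "loan" and "interest", the clue word "money" co-occurs with both context words in the toy corpus while "water" co-occurs with neither, so the sketch selects the financial sense. A real system would need smoothing for rare pairs and a principled source of clue words, neither of which is described in the text above.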
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
