The Uned systems at Senseval-2

David Fernandez-Amoros; Julio Gonzalo; Felisa Verdejo

arXiv:0910.5410 · cs.CL · October 29, 2009 · 1 citation

TL;DR

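This paper presents the Uned systems at Senseval-2: an unsupervised word sense disambiguation system based on mutual information, plus a supervised extension for the lexical sample task. The unsupervised system scored first among unsupervised entries in both English tasks, yet it fell slightly below the first sense heuristic on all words, which shows that unsupervised disambiguation remains a hard problem.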

Contribution

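The paper describes an unsupervised word sense disambiguation system based on mutual information measured over a 277-million-word corpus, together with a supervised extension submitted to the lexical sample task, and reports results on the SENSEVAL-2 English all words and lexical sample tasks.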

Findings

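01 The unsupervised system scored 56.9% recall on the all words task, slightly below the first sense heuristic.

02 The unsupervised system scored 40.2% recall on the lexical sample task, 3.6% above the first sense heuristic.

03 The system ranked first among unsupervised systems in both tasks.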

Abstract

We participated in the SENSEVAL-2 English tasks (all words and lexical sample) with an unsupervised system based on mutual information measured over a large corpus (277 million words) and some additional heuristics. A supervised extension of the system was also submitted to the lexical sample task. Our system scored first among unsupervised systems in both tasks: 56.9% recall in all words, 40.2% in lexical sample. This is slightly worse than the first sense heuristic for all words and 3.6% better for the lexical sample, a strong indication that unsupervised Word Sense Disambiguation remains a strong challenge.
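The abstract does not say how mutual information is computed or how it feeds the sense decision, so the following is only a minimal sketch of one common reading: pointwise mutual information (PMI) estimated from sentence-level co-occurrence, used to score each candidate sense by how strongly its clue words associate with the context. The function names `pmi_table` and `pick_sense`, the clue-word lists, and the six-sentence toy corpus are all hypothetical and stand in for the 277-million-word corpus and the additional heuristics mentioned above.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Build a PMI function from sentence-level co-occurrence counts.

    Assumption: two words co-occur when they share a sentence. The
    abstract does not specify the window, so this is illustrative only.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        types = sorted(set(sentence))  # count each word once per sentence
        word_counts.update(types)
        pair_counts.update(combinations(types, 2))
    n = len(sentences)

    def pmi(a, b):
        joint = pair_counts[tuple(sorted((a, b)))]
        if joint == 0:
            return 0.0  # assumption: unseen pairs contribute nothing
        # log2( P(a, b) / (P(a) * P(b)) ) with probabilities = counts / n
        return math.log2(joint * n / (word_counts[a] * word_counts[b]))

    return pmi


def pick_sense(context, sense_clues, pmi):
    """Score each sense by the summed PMI between its clue words and the context."""
    scores = {
        sense: sum(pmi(clue, word) for clue in clues for word in context)
        for sense, clues in sense_clues.items()
    }
    return max(scores, key=scores.get), scores


# Toy data, invented for illustration.
corpus = [
    ["bank", "money", "loan"],
    ["money", "loan", "interest"],
    ["river", "water", "shore"],
    ["bank", "river", "water"],
    ["money", "interest", "deposit"],
    ["water", "shore", "fish"],
]
pmi = pmi_table(corpus)
sense_clues = {"finance": ["money", "loan"], "river": ["water", "shore"]}
best, scores = pick_sense(["interest", "deposit"], sense_clues, pmi)
print(best, scores)
```

On this toy corpus the context words "interest" and "deposit" co-occur only with the finance clue words, so the finance sense of "bank" gets the higher score. A real system would need smoothing for rare pairs and a fallback for contexts with no informative co-occurrences.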

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification