Exemplar-Based Word Sense Disambiguation: Some Recent Improvements

Hwee Tou Ng (DSO National Laboratories)

arXiv:cmp-lg/9706010·cmp-lg·February 3, 2008·67 cites

TL;DR

This paper reports improvements to exemplar-based word sense disambiguation: using a larger number of nearest neighbors k, chosen automatically by 10-fold cross-validation, yields accuracy comparable to that of the Naive-Bayes algorithm on a large sense-tagged corpus.

Contribution

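The paper improves exemplar-based disambiguation by using a larger k (the number of nearest neighbors consulted when classifying a test example) and by selecting the best k automatically through 10-fold cross-validation, which raises accuracy on a large sense-tagged corpus.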

Findings

01

Achieved higher disambiguation accuracy with larger k values.

02

Selecting k automatically via 10-fold cross-validation improves disambiguation accuracy.

03

Accuracy is comparable to that of the Naive-Bayes algorithm on the same data set.

Abstract

In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of $k$, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best $k$, we have obtained improved disambiguation accuracy on a large sense-tagged corpus first used in \cite{ng96}. The accuracy achieved by our improved exemplar-based classifier is comparable to the accuracy on the same data set obtained by the Naive-Bayes algorithm, which was reported in \cite{mooney96} to have the highest disambiguation accuracy among seven state-of-the-art machine learning algorithms.
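
The procedure described in the abstract (classify a test example by its k nearest exemplars, and pick k by 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the overlap distance, the interleaved fold split, the function names (`overlap_distance`, `knn_predict`, `cross_validated_accuracy`, `select_k`), the candidate values of k, and the toy data are all assumptions introduced here.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions where two examples differ (assumed metric)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_predict(train, query, k):
    """Return the majority sense among the k exemplars nearest to `query`."""
    # sorted() is stable, so exemplars at equal distance keep training order.
    nearest = sorted(train, key=lambda ex: overlap_distance(ex[0], query))[:k]
    votes = Counter(sense for _, sense in nearest)
    return votes.most_common(1)[0][0]


def cross_validated_accuracy(data, k, n_folds=10):
    """Estimate accuracy for a given k with n-fold cross-validation."""
    correct = 0
    for fold in range(n_folds):
        # Interleaved split: every n_folds-th example is held out.
        held_out = data[fold::n_folds]
        train = [ex for i, ex in enumerate(data) if i % n_folds != fold]
        correct += sum(
            knn_predict(train, feats, k) == sense for feats, sense in held_out
        )
    return correct / len(data)


def select_k(data, candidate_ks, n_folds=10):
    """Pick the candidate k with the best cross-validated accuracy."""
    # max() returns the first best candidate, so ties favor earlier entries.
    return max(
        candidate_ks, key=lambda k: cross_validated_accuracy(data, k, n_folds)
    )


# Hypothetical toy data: (previous word, next word) -> sense of "interest".
base = [
    (("bank", "rate"), "money"),
    (("low", "rate"), "money"),
    (("keen", "in"), "attention"),
    (("great", "in"), "attention"),
    (("low", "in"), "money"),  # deliberately noisy exemplar
]
toy_data = base * 4
best_k = select_k(toy_data, candidate_ks=[1, 3, 5])
```

With k = 1 a single mislabeled exemplar can decide the outcome, while a larger k lets the surrounding exemplars outvote it; cross-validation then chooses among candidate values using only the training data.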

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems