Integrating Multiple Knowledge Sources to Disambiguate Word Sense: An Exemplar-Based Approach
Hwee Tou Ng, Hian Beng Lee (Defence Science Organisation)

TL;DR
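This paper introduces Lexas, an exemplar-based word sense disambiguation program that combines multiple knowledge sources. It achieves higher accuracy than previous work on a common data set and performs better than the most frequent sense heuristic on the highly ambiguous words in a large corpus tagged with WordNet senses.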
Contribution
The paper presents a novel exemplar-based WSD approach that integrates diverse knowledge sources, improving disambiguation accuracy over existing methods.
Findings
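Lexas achieves higher accuracy than previous work on a common data set.
On the highly ambiguous words in a large corpus tagged with the refined senses of WordNet, Lexas performs better than the most frequent sense heuristic.
The approach combines part of speech of neighboring words, morphological form, the unordered set of surrounding words, local collocations, and verb-object syntactic relations.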
Abstract
In this paper, we present a new approach for word sense disambiguation (WSD) using an exemplar-based learning algorithm. This approach integrates a diverse set of knowledge sources to disambiguate word sense, including part of speech of neighboring words, morphological form, the unordered set of surrounding words, local collocations, and verb-object syntactic relation. We tested our WSD program, named Lexas, on both a common data set used in previous work, as well as on a large sense-tagged corpus that we separately constructed. Lexas achieves a higher accuracy on the common data set, and performs better than the most frequent heuristic on the highly ambiguous words in the large corpus tagged with the refined senses of WordNet.
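To make the idea of exemplar-based disambiguation concrete, here is a minimal sketch and not the authors' implementation. It assumes a plain nearest-neighbor classifier with a simple mismatch-count distance, and it uses only two of the knowledge sources named in the abstract (local collocations, approximated by words at fixed offsets, and the unordered set of surrounding words). Part of speech, morphological form, and verb-object relations are left out because they would need a tagger and a parser. All function names, the distance measure, and the toy "bank" sentences are hypothetical and chosen for illustration only.

```python
from collections import Counter


def extract_features(tokens, i, window=2):
    """Describe the target word at position i with symbolic features."""
    feats = {}
    # Local context: the word at each fixed offset around the target.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"w{off:+d}"] = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
    # Unordered set of the other words in the sentence.
    feats["bag"] = frozenset(t.lower() for k, t in enumerate(tokens) if k != i)
    return feats


def distance(a, b):
    """Assumed distance: one point per mismatched offset word, plus
    (1 - Jaccard overlap) for the two bags of surrounding words."""
    d = 0.0
    for key in a:
        if key == "bag":
            union = a[key] | b[key]
            overlap = len(a[key] & b[key]) / len(union) if union else 1.0
            d += 1.0 - overlap
        else:
            d += 0.0 if a[key] == b[key] else 1.0
    return d


def classify(train, feats, k=1):
    """Return the majority sense among the k stored exemplars closest to feats."""
    nearest = sorted(train, key=lambda ex: distance(ex[0], feats))[:k]
    return Counter(sense for _, sense in nearest).most_common(1)[0][0]


def most_frequent_sense(train):
    """Baseline: always predict the sense seen most often in training."""
    return Counter(sense for _, sense in train).most_common(1)[0][0]


# Invented toy exemplars for the word "bank": (tokens, target index, sense).
raw = [
    ("She deposited money in the bank yesterday".split(), 5, "finance"),
    ("The bank approved the loan".split(), 1, "finance"),
    ("They fished from the river bank".split(), 5, "river"),
]
train = [(extract_features(toks, i), sense) for toks, i, sense in raw]

test_feats = extract_features("He sat on the river bank".split(), 5)
prediction = classify(train, test_feats)
baseline = most_frequent_sense(train)
```

On this toy data the nearest exemplar shares the preceding words "the river" with the test sentence, so the classifier picks the "river" sense, while the most frequent sense baseline picks "finance". This mirrors, at a very small scale, the kind of comparison the abstract reports.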
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
