Indexing with WordNet synsets can improve Text Retrieval
Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan Cigarran (UNED, Spain)

TL;DR
Indexing by WordNet synsets instead of word forms improves vector space text retrieval by up to 29% on a manually disambiguated collection. Automatic disambiguation errors reduce the gain, and without query disambiguation synset indexing performs at best as well as standard word indexing.
Contribution
This paper demonstrates that synset-based indexing enhances text retrieval effectiveness and explores the effects of disambiguation accuracy on performance.
Findings
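Up to 29% improvement with synset indexing over word-form indexing
This result holds for a manually disambiguated collection of queries and documents
Automatic disambiguation errors in document indexing affect retrieval quality
Without query disambiguation, synset indexing performs at best as well as word indexing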
Abstract
The classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms. This result is obtained for a manually disambiguated test collection (of queries and documents) derived from the Semcor semantic concordance. The sensitivity of retrieval performance to (automatic) disambiguation errors when indexing documents is also measured. Finally, it is observed that if queries are not disambiguated, indexing by synsets performs (at best) only as good as standard word indexing.
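The contrast between the two indexing spaces can be illustrated with a minimal sketch. It is not the authors' experimental setup: the sense inventory `SYNSET_OF`, its synset ids, the toy documents, and the smoothed TF-IDF weighting are all assumptions made for illustration, and the sense tags stand in for the manual disambiguation that Semcor provides.

```python
import math
from collections import Counter

# Hypothetical sense inventory: (word form, sense number) -> synset id.
# The ids are invented for this sketch; they are not real WordNet identifiers.
SYNSET_OF = {
    ("car", 1): "syn:automobile",
    ("automobile", 1): "syn:automobile",
    ("bank", 1): "syn:financial_institution",
    ("bank", 2): "syn:river_side",
    ("spring", 1): "syn:season",
}

def word_terms(tagged_text):
    """Index by word form: sense tags are ignored."""
    return [word for word, _sense in tagged_text]

def synset_terms(tagged_text):
    """Index by synset: synonyms merge, senses of one word split."""
    return [SYNSET_OF.get((word, sense), word) for word, sense in tagged_text]

def idf_table(docs_terms):
    """Smoothed inverse document frequency (one common variant, an assumption)."""
    n_docs = len(docs_terms)
    df = Counter(term for terms in docs_terms for term in set(terms))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

def weigh(terms, idf):
    """TF-IDF vector as a dict; terms unseen in the collection get weight 0."""
    return {t: tf * idf.get(t, 0.0) for t, tf in Counter(terms).items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def score(query, docs, to_terms):
    """Cosine score of the query against every document in one indexing space."""
    docs_terms = [to_terms(d) for d in docs]
    idf = idf_table(docs_terms)
    q_vec = weigh(to_terms(query), idf)
    return [cosine(q_vec, weigh(terms, idf)) for terms in docs_terms]

# Toy sense-tagged collection (hypothetical data).
docs = [
    [("automobile", 1), ("bank", 1)],  # financing a car at a bank
    [("bank", 2), ("spring", 1)],      # a river bank in spring
]
query = [("car", 1), ("bank", 1)]

word_scores = score(query, docs, word_terms)
syn_scores = score(query, docs, synset_terms)
print("word forms:", word_scores)
print("synsets:   ", syn_scores)
```

In this toy setting, word-form indexing cannot separate the two documents, because the only shared term is the ambiguous "bank", whereas synset indexing matches "car" with "automobile" and keeps the two senses of "bank" apart. The sketch assumes perfect sense tags on both queries and documents, which is the condition under which the abstract reports its gains.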
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
