Decision Lists for English and Basque
Eneko Agirre, David Martinez

TL;DR
This paper presents supervised decision list systems for English and Basque word sense disambiguation. The Basque features are extracted with a morphological analyzer, and a variant with automatic feature selection reaches a preset precision of 85% at the cost of coverage.
Contribution
It introduces new feature sets for Basque, extracted with a morphological analyzer, and an automatic feature selection step that lets the decision lists trade coverage for a preset precision.
Findings
Reached a preset precision of 85% with automatic feature selection, at the cost of coverage
Defined separate feature sets for English and Basque
Demonstrated effectiveness of decision lists for disambiguation
Abstract
In this paper we describe the systems we developed for the English (lexical and all-words) and Basque tasks. They were all supervised systems based on Yarowsky's Decision Lists. We used Semcor for training in the English all-words task. We defined different feature sets for each language. For Basque, in order to extract all the information from the text, we used a morphological analyzer to define features that have not been used before in the literature. We also implemented systems that automatically selected good features and were able to reach a preset precision (85%) at the cost of coverage. The systems that used all the features were identified as BCU-ehu-dlist-all, and the systems that selected some features as BCU-ehu-dlist-best.
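To make the precision versus coverage trade-off concrete, here is a minimal Python sketch of a Yarowsky-style decision list. It is not the authors' implementation: the smoothing constant `alpha`, the `min_weight` cutoff, the feature names, and the toy senses are all assumptions for illustration, and the abstract does not say how the systems actually selected features. Each rule pairs one feature with one sense and is weighted by a smoothed log-likelihood ratio; the classifier answers with the strongest matching rule and abstains when no rule above the cutoff applies.

```python
import math
from collections import Counter, defaultdict


def train_decision_list(examples, alpha=0.1):
    """Build a decision list from (features, sense) pairs.

    Each rule is (weight, feature, sense), where weight is the smoothed
    log-likelihood ratio of that sense against all other senses given
    the feature. `alpha` is an assumed smoothing constant.
    """
    counts = defaultdict(Counter)  # feature -> sense -> count
    for features, sense in examples:
        for feature in set(features):
            counts[feature][sense] += 1

    rules = []
    for feature, by_sense in counts.items():
        total = sum(by_sense.values())
        for sense, count in by_sense.items():
            rest = total - count
            weight = math.log((count + alpha) / (rest + alpha))
            rules.append((weight, feature, sense))

    # Strongest evidence first: the first matching rule decides.
    rules.sort(key=lambda rule: rule[0], reverse=True)
    return rules


def classify(rules, features, min_weight=float("-inf")):
    """Return the sense of the best matching rule, or None to abstain.

    Raising `min_weight` (a hypothetical selection threshold) keeps only
    strong rules, so fewer examples get answered but those answers rest
    on stronger evidence.
    """
    present = set(features)
    for weight, feature, sense in rules:
        if weight < min_weight:
            break  # all remaining rules are weaker than the cutoff
        if feature in present:
            return sense
    return None


# Toy training data with made-up features and senses.
train = [
    (["w:river", "w:water"], "bank/shore"),
    (["w:river", "w:fish"], "bank/shore"),
    (["w:money", "w:loan"], "bank/finance"),
    (["w:money", "w:water"], "bank/finance"),
]
rules = train_decision_list(train)

strong = classify(rules, ["w:river", "w:loan"])
abstain = classify(rules, ["w:water"], min_weight=1.0)
```

In this toy run `w:water` occurs equally often with both senses, so its rules carry weight 0 and fall below the cutoff; the classifier abstains on a context that contains only that feature. Using every rule corresponds loosely to the "all" systems, and keeping only high-confidence rules to the "best" systems.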
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
