The risk of sub-optimal use of Open Source NLP Software: UKB is inadvertently state-of-the-art in knowledge-based WSD
Eneko Agirre, Oier López de Lacalle, Aitor Soroa

TL;DR
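This paper shows that UKB, an open source tool for knowledge-based Word Sense Disambiguation (WSD) released in 2009, is the state of the art nine years later, even though it has often been used out-of-the-box in sub-optimal settings. The case highlights the pitfalls of releasing open source NLP software without optimal default settings and precise instructions for reproducibility.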
Contribution
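The paper shows that UKB has often been run out-of-the-box in sub-optimal settings, which obscured its real performance: it is in fact the current best method for knowledge-based WSD. This underlines the importance of shipping optimal defaults and of configuring the software properly.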
Findings
UKB is now the state-of-the-art in knowledge-based WSD.
Out-of-the-box use in sub-optimal settings has understated UKB's performance since its 2009 release.
Releasing open source NLP software without optimal default settings and precise instructions undermines reproducibility.
Abstract
UKB is an open source collection of programs for performing, among other tasks, knowledge-based Word Sense Disambiguation (WSD). Since it was released in 2009 it has been often used out-of-the-box in sub-optimal settings. We show that nine years later it is the state-of-the-art on knowledge-based WSD. This case shows the pitfalls of releasing open source NLP software without optimal default settings and precise instructions for reproducibility.
