TL;DR
LEXpander introduces a novel lexicon expansion method using colexification networks, outperforming existing approaches in precision and recall, and effectively generating comprehensive word lists for text analysis in multiple languages.
Contribution
This paper presents LEXpander, a new lexicon expansion technique leveraging colexification data, providing a systematic and high-performing alternative to existing methods.
Findings
LEXpander outperforms existing lexicon expansion methods in precision and recall.
Expanded word lists from LEXpander are effective in various text analysis applications.
The method works well across different linguistic categories and languages, including English and German.
Abstract
Recent approaches to text analysis from social media and other corpora rely on word lists to detect topics, measure meaning, or select relevant documents. These lists are often generated by applying computational lexicon expansion methods to small, manually curated sets of root words. Despite the wide use of this approach, we still lack an exhaustive comparative analysis of the performance of lexicon expansion methods and how they can be improved with additional linguistic data. In this work, we present LEXpander, a method for lexicon expansion that leverages novel data on colexification, i.e. semantic networks connecting words based on shared concepts and translations to other languages. We evaluate LEXpander in a benchmark including widely used methods for lexicon expansion based on various word embedding models and synonym networks. We find that LEXpander outperforms existing…
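As a rough illustration of the idea described in the abstract, the sketch below expands a seed word over a small word network and scores the result against a reference list with precision and recall. It is not the authors' implementation: the edge list, the one-hop expansion rule, and the names TOY_COLEX_EDGES, build_graph, expand, and precision_recall are all assumptions made up for this example, and the toy words are not taken from any real colexification dataset.

```python
from collections import defaultdict

# Hypothetical toy edges standing in for a colexification network.
# These pairs are invented for illustration and are not real data.
TOY_COLEX_EDGES = [
    ("anger", "rage"),
    ("anger", "fury"),
    ("rage", "wrath"),
    ("sad", "sorrow"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of word pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand(seeds, graph):
    """Return the seed words plus their direct neighbors (one hop).

    One-hop expansion is an assumption of this sketch, not a claim
    about how LEXpander traverses its network.
    """
    expanded = set(seeds)
    for word in seeds:
        expanded |= graph.get(word, set())
    return expanded


def precision_recall(predicted, reference):
    """Score a predicted word list against a reference word list."""
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


graph = build_graph(TOY_COLEX_EDGES)
expanded = expand({"anger"}, graph)
precision, recall = precision_recall(
    expanded, {"anger", "rage", "fury", "wrath"}
)
```

In this toy setup the expansion reaches "rage" and "fury" but not "wrath", which sits two hops from the seed, so every predicted word is in the reference list while one reference word is missed. A benchmark like the one the abstract describes would apply this kind of scoring to each expansion method on the same seed and reference lists.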
