Partial Colexifications Improve Concept Embeddings
Arne Rubehn, Johann-Mattis List

TL;DR
This paper shows that incorporating partial colexifications into concept embeddings improves how well the embeddings capture semantic relationships between concepts, which benefits computational linguistics tasks, especially those involving cross-linguistic data or low-resource languages.
Contribution
It introduces a novel method for improving concept embeddings by leveraging partial colexifications, extending previous approaches that focused only on word-level relations.
Findings
Improved correlation with lexical similarity ratings
Enhanced detection of semantic shifts
Better representation of semantic relationships
Abstract
While the embedding of words has revolutionized the field of Natural Language Processing, the embedding of concepts has received much less attention so far. A dense and meaningful representation of concepts, however, could prove useful for several tasks in computational linguistics, especially those involving cross-linguistic data or sparse data from low-resource languages. The first methods proposed so far embed concepts from automatically constructed colexification networks. While these approaches start from automatically inferred polysemies attested across a larger number of languages, they are restricted to the word level, ignoring lexical relations that hold only for parts of the words in a given language. Building on recently introduced methods for the inference of partial colexifications, we show how they can be used to improve concept embeddings in meaningful…
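The difference between full (word-level) and partial colexification mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The word list is a toy example with simplified orthographic forms, the function name `colexification_counts` and the `min_affix_len` threshold are assumptions made for illustration, and only affix colexification (one form being a prefix or suffix of another) is covered, as one possible kind of partial colexification.

```python
from collections import defaultdict
from itertools import permutations

# Toy word list of (language, concept, form) triples.
# Forms are simplified orthographic spellings, for illustration only.
WORDLIST = [
    ("German", "HAND", "hand"),
    ("German", "GLOVE", "handschuh"),
    ("German", "ARM", "arm"),
    ("Russian", "HAND", "ruka"),
    ("Russian", "ARM", "ruka"),
    ("Russian", "FINGER", "palec"),
    ("Russian", "TOE", "palec"),
    ("Spanish", "FINGER", "dedo"),
    ("Spanish", "TOE", "dedo"),
    ("Spanish", "HAND", "mano"),
]


def colexification_counts(wordlist, min_affix_len=3):
    """Count full and affix colexifications per concept pair.

    Returns two dicts that map a concept pair to the number of languages
    attesting it. Full colexifications are undirected (frozenset keys);
    affix colexifications are directed (part_concept, whole_concept).
    """
    by_language = defaultdict(list)
    for language, concept, form in wordlist:
        by_language[language].append((concept, form))

    full = defaultdict(set)
    affix = defaultdict(set)
    for language, entries in by_language.items():
        for (c1, f1), (c2, f2) in permutations(entries, 2):
            if c1 == c2:
                continue
            if f1 == f2:
                # Same form for two concepts: word-level colexification.
                full[frozenset((c1, c2))].add(language)
            elif len(f1) >= min_affix_len and (
                f2.startswith(f1) or f2.endswith(f1)
            ):
                # f1 is a proper prefix or suffix of f2: affix colexification.
                affix[(c1, c2)].add(language)

    full_counts = {pair: len(langs) for pair, langs in full.items()}
    affix_counts = {pair: len(langs) for pair, langs in affix.items()}
    return full_counts, affix_counts


full_counts, affix_counts = colexification_counts(WORDLIST)
for pair, n in full_counts.items():
    print("full", sorted(pair), n)
for (part, whole), n in affix_counts.items():
    print("affix", part, "->", whole, n)
```

The language counts can serve as edge weights in a concept graph. The HAND -> GLOVE link appears only in the affix dictionary, which is the kind of relation a purely word-level network would miss.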
Taxonomy
Topics: Bayesian Modeling and Causal Inference
