Model Choices Influence Attributive Word Associations: A Semi-supervised Analysis of Static Word Embeddings
Geetanjali Bihani, Julia Taylor Rayz

TL;DR
This study investigates how different static word embedding architectures and training choices influence the encoding of attributive word associations, revealing significant model-dependent variations and sensitivities.
Contribution
It introduces a semi-supervised clustering approach to analyze attributive word associations across multiple embedding models, highlighting the impact of training procedures and corpora.
Findings
Context learning flavor affects association distinguishability
Significant inter-model disparity in word associations
Embedding architecture influences association patterns
Abstract
Static word embeddings encode word associations that are used extensively in downstream NLP tasks. Although prior studies have discussed the nature of such associations in terms of the biases and lexical regularities they capture, how these associations vary with the embedding training procedure remains unclear. This work addresses that gap by assessing attributive word associations across five static word embedding architectures and analyzing the impact of model architecture, context learning flavor, and training corpora. Our approach uses a semi-supervised clustering method to cluster annotated proper nouns and adjectives based on their word embedding features, revealing the attributive word associations formed in the embedding space without introducing confirmation bias. Our results reveal that the choice of the context learning…
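To make the idea of semi-supervised clustering over embedding features concrete, here is a minimal sketch. It is not the authors' method: this summary does not name the specific algorithm, so the sketch swaps in seeded k-means, where annotated words fix the initial centroids and keep their labels while unannotated words are assigned to the nearest centroid. The function name `seeded_kmeans`, the placeholder words, the 2-D vectors, and the use of Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def seeded_kmeans(X, seed_labels, n_clusters, n_iter=20):
    """Seeded k-means (illustrative, not the paper's algorithm).

    Rows with seed_labels >= 0 are annotated and keep their cluster;
    rows with seed_labels == -1 are assigned to the nearest centroid.
    Assumes every cluster has at least one annotated seed.
    """
    X = np.asarray(X, dtype=float)
    seed_labels = np.asarray(seed_labels)
    labels = seed_labels.copy()
    # Initialise each centroid as the mean of its annotated seed vectors.
    centroids = np.stack(
        [X[seed_labels == k].mean(axis=0) for k in range(n_clusters)]
    )
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Annotated words always stay in their annotated cluster.
        new_labels = np.where(seed_labels >= 0, seed_labels, new_labels)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recompute centroids from the current assignment.
        centroids = np.stack(
            [X[labels == k].mean(axis=0) for k in range(n_clusters)]
        )
    return labels, centroids

# Toy 2-D "embeddings": made-up values, not output from any real model.
words = ["name_a", "name_b", "adj_1", "adj_2", "adj_3", "adj_4"]
X = np.array([
    [1.0, 0.0],  # name_a (annotated, cluster 0)
    [0.0, 1.0],  # name_b (annotated, cluster 1)
    [0.9, 0.1],  # adj_1
    [0.8, 0.3],  # adj_2
    [0.1, 0.9],  # adj_3
    [0.2, 0.8],  # adj_4
])
seed_labels = np.array([0, 1, -1, -1, -1, -1])  # -1 marks unannotated words

labels, centroids = seeded_kmeans(X, seed_labels, n_clusters=2)
associations = {w: int(c) for w, c in zip(words, labels)}
```

In this toy setup, `associations` maps each adjective to the cluster of the proper noun it sits closest to. Running the same procedure on vectors from different embedding models and comparing the resulting assignments is one way such inter-model differences could be examined.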
