Analysis of Word Embeddings Using Fuzzy Clustering
Shahin Atakishiyev, Marek Z. Reformat

TL;DR
This paper investigates the application of fuzzy clustering algorithms to word embeddings, revealing their sensitivity to high-dimensional data and demonstrating how parameter tuning enhances their effectiveness in analyzing semantic similarities.
Contribution
The paper introduces a fuzzy clustering approach to analyze word embeddings, showing how parameter tuning improves clustering performance in high-dimensional spaces.
Findings
Fuzzy clustering algorithms are sensitive to high-dimensional data.
Parameter tuning significantly affects clustering performance.
Fuzzy clustering provides insights into word memberships across clusters.
Abstract
In data-dominated systems and applications, the concept of representing words in a numerical format has gained a lot of attention. Several approaches are used to generate such a representation. An interesting issue is the ability of such representations, called embeddings, to imitate human-based semantic similarity between words. In this study, we perform a fuzzy-based analysis of vector representations of words, i.e., word embeddings. We apply two popular fuzzy clustering algorithms to count-based word embeddings, known as GloVe, of different dimensionality. Words from WordSim-353, called the gold standard, are represented as vectors and clustered. The results indicate that fuzzy clustering algorithms are very sensitive to high-dimensional data, and parameter tuning can dramatically change their performance. We show that by adjusting the value of the…
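To make the idea of fuzzy memberships concrete, here is a minimal sketch of fuzzy c-means in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the summary does not name the two algorithms used, so fuzzy c-means stands in as a representative; the data are synthetic Gaussian blobs rather than GloVe vectors for WordSim-353 words; and the parameter varied, the fuzzifier m, is chosen for illustration only, since the truncated abstract does not say which parameter the authors adjust. The helper names (fuzzy_c_means, mean_max_membership) are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (illustrative sketch).

    Returns (centers, U), where U[i, k] is the membership of point i
    in cluster k and every row of U sums to 1.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def mean_max_membership(X, n_clusters, m):
    """Average of each point's largest membership (1.0 means fully crisp)."""
    _, U = fuzzy_c_means(X, n_clusters, m=m)
    return float(U.max(axis=1).mean())


# Synthetic stand-in for word vectors (an assumption): 3 blobs in 50 dimensions.
rng = np.random.default_rng(42)
blob_centers = rng.normal(size=(3, 50))
labels = rng.integers(0, 3, size=150)
X = blob_centers[labels] + rng.normal(size=(150, 50))

for m in (1.1, 2.0):
    print(f"m={m}: mean max membership = {mean_max_membership(X, 3, m):.3f}")
```

A mean maximum membership near 1 indicates nearly crisp assignments, while a value near 1/3 (for three clusters) means every point belongs almost equally to every cluster. In high-dimensional data, larger fuzzifier values tend to push memberships toward that uniform value, which is one way the sensitivity described above can show up.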
Taxonomy
Topics: Advanced Text Analysis Techniques · Text and Document Classification Technologies · Topic Modeling
Methods: GloVe Embeddings
