An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition
M. Maziyah Mohamed (1), R. H. Baayen (1) ((1) University of Tuebingen)

TL;DR
This study investigates embedding-based measures of semantic transparency in Malay word recognition, examining their geometric properties and how well they predict lexical decision latencies; centroid similarity emerges as a key predictor.
Contribution
It introduces novel embedding-based measures of semantic transparency and demonstrates their effectiveness in predicting lexical decision latencies in Malay.
Findings
Centroid similarity best predicts decision latencies.
Embedding measures reveal clusters of complex words by prefix class.
Semantic space geometry correlates with morphological processing.
Abstract
Studies of morphological processing have shown that semantic transparency is crucial for word recognition, but its computational operationalization is still under discussion. Our primary objectives are to explore embedding-based measures of semantic transparency and to assess their impact on reading. First, we explored the geometry of complex words in semantic space. To do so, we conducted a t-distributed Stochastic Neighbor Embedding (t-SNE) clustering analysis on 4,226 Malay prefixed words. Several clusters of complex words were observed, varying by prefix class. Then, we derived five simple measures and investigated whether they were significant predictors of lexical decision latencies. Two sets of Linear Discriminant Analyses were run in which the prefix of a word was predicted from either word embeddings or shift vectors (i.e., the vector obtained by subtracting the base word's embedding from the derived word's embedding).…
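The shift vector defined in the abstract (derived-word embedding minus base-word embedding) can be illustrated with a minimal sketch. The function name `shift_vectors` and the toy 3-dimensional embeddings are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
import numpy as np

def shift_vectors(derived, base):
    """Subtract each base-word embedding from its derived-word embedding.

    Row i of `derived` and row i of `base` are assumed to belong to the
    same word pair (an assumption of this sketch).
    """
    derived = np.asarray(derived, dtype=float)
    base = np.asarray(base, dtype=float)
    if derived.shape != base.shape:
        raise ValueError("derived and base must have the same shape")
    return derived - base

# Toy embeddings with made-up values (not taken from the paper).
base = np.array([[1.0, 0.0, 2.0],
                 [0.5, 1.5, -1.0]])
derived = np.array([[1.5, 1.0, 2.0],
                    [0.0, 1.5, 1.0]])

# One shift vector per (derived, base) pair.
shifts = shift_vectors(derived, base)
```

Shift vectors computed this way, paired with prefix labels, are the kind of input a classifier such as Linear Discriminant Analysis would take when predicting the prefix.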
Taxonomy
Topics: Reading and Literacy Development · Text Readability and Simplification · Neurobiology of Language and Bilingualism
Methods: Balanced Selection
