Features of word similarity
Arthur M. Jacobs, Annette Kinder

TL;DR
This paper compares various computational models of word similarity and association, evaluating their ability to predict human rating data and highlighting the complexity of factors involved beyond semantic relatedness.
Contribution
It systematically assesses 28 models combining surface and semantic features, providing evidence that word similarity involves more than semantic relatedness alone.
Findings
Models show limited cross-validated performance
Word similarity ratings are influenced by multiple factors
Development of psychological process models is needed
Abstract
In this theoretical note, we compare different types of computational models of word similarity and association in their ability to predict a set of about 900 ratings. Using regression and predictive modeling tools (neural net, decision tree), we evaluate the performance of 28 models that use different combinations of surface and semantic word features. The results support the hypothesis that word similarity ratings are based on more than semantic relatedness alone. The limited cross-validated performance of the models calls for the development of psychological process models of the word similarity rating task.
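The evaluation idea in the abstract (predict ratings from surface and semantic features, then judge the model on held-out data) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the surface feature is a normalized edit-distance similarity, the semantic feature is a cosine between hypothetical word vectors, and the model is ordinary least squares rather than the neural net or decision tree the abstract mentions. The function names (`pair_features`, `cross_validated_r2`, and so on) are hypothetical.

```python
import math
import random

def levenshtein(a, b):
    """Edit distance between two strings (classic dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def surface_similarity(a, b):
    """Assumed surface feature: 1 minus the length-normalized edit distance."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def cosine(u, v):
    """Assumed semantic feature: cosine similarity of two word vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def pair_features(w1, w2, vectors):
    """Feature row for a word pair; `vectors` is a hypothetical word-to-vector dict."""
    return [surface_similarity(w1, w2), cosine(vectors[w1], vectors[w2])]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares weights (intercept first) via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(w, row):
    """Prediction of the fitted linear model for one feature row."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def cross_validated_r2(X, y, k=5, seed=0):
    """R^2 computed only from predictions on folds the model was not fit on."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        w = fit_ols([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(w, X[i])
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot
```

In use, one would build a feature row per rated word pair with `pair_features`, collect the human ratings in `y`, and compare `cross_validated_r2` across feature subsets (surface only, semantic only, both). A gap between in-sample fit and the cross-validated score is the kind of limited generalization the abstract refers to.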
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Intelligent Tutoring Systems and Adaptive Learning
