Using large language models to estimate features of multi-word expressions: Concreteness, valence, arousal
Gonzalo Martínez, Juan Diego Molero, Sandra González, Javier Conde, Marc Brysbaert, Pedro Reviriego

TL;DR
This paper demonstrates that large language models, specifically ChatGPT-4o, can accurately estimate psycholinguistic features like concreteness, valence, and arousal for multi-word expressions, providing valuable data for linguistic research.
Contribution
The study introduces a systematic evaluation of ChatGPT-4o's ability to predict psycholinguistic features of multi-word expressions, outperforming previous AI models and offering large-scale datasets.
Findings
ChatGPT-4o shows a strong correlation with human concreteness ratings for multi-word expressions (r = .8).
For valence and arousal ratings of individual words, ChatGPT-4o matches or outperforms previous AI models.
Large datasets of AI-generated psycholinguistic norms are provided for research use.
Abstract
This study investigates the potential of large language models (LLMs) to provide accurate estimates of concreteness, valence and arousal for multi-word expressions. Unlike previous artificial intelligence (AI) methods, LLMs can capture the nuanced meanings of multi-word expressions. We systematically evaluated ChatGPT-4o's ability to predict concreteness, valence and arousal. In Study 1, ChatGPT-4o showed strong correlations with human concreteness ratings (r = .8) for multi-word expressions. In Study 2, these findings were repeated for valence and arousal ratings of individual words, matching or outperforming previous AI models. Study 3 extended the valence and arousal analysis to multi-word expressions and showed promising results despite the lack of large-scale human benchmarks. These findings highlight the potential of LLMs for generating valuable psycholinguistic data related to…
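The evaluation described in the abstract rests on correlating model estimates with human ratings. The following is a minimal sketch of that comparison, not the authors' code: the function name pearson_r, the example expressions, and every rating value are hypothetical and chosen only for illustration, and the rating scale is arbitrary.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy ratings (NOT taken from the paper or its datasets).
human_ratings = {
    "coffee table": 4.8,
    "kick the bucket": 2.1,
    "point of view": 1.9,
    "fire engine": 4.9,
    "once in a while": 1.4,
}
model_ratings = {
    "coffee table": 4.9,
    "kick the bucket": 2.6,
    "point of view": 1.7,
    "fire engine": 4.7,
    "once in a while": 1.2,
}

# Compare only the expressions that have both a human and a model rating.
shared = sorted(human_ratings.keys() & model_ratings.keys())
r = pearson_r(
    [human_ratings[k] for k in shared],
    [model_ratings[k] for k in shared],
)
print(f"r = {r:.2f} over {len(shared)} expressions")
```

Restricting the comparison to expressions present in both sets matters in practice, because human norms for multi-word expressions cover far fewer items than a model can rate.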
Taxonomy
Topics: Natural Language Processing Techniques
