# Updating the German Psycholinguistic Word Toolbox with AI-Generated Estimates of Concreteness, Valence, Arousal, Age of Acquisition, and Familiarity

**Authors:** Javier Conde, Gonzalo Martínez, María Grandury, Carlos Arriaga, Juan Haro, Sascha Schroeder, Florian Hintz, Pedro Reviriego, Marc Brysbaert

PMC · DOI: 10.5334/joc.482 · Journal of Cognition · 2026-01-08

## TL;DR

This paper extends a German psycholinguistic word toolbox with GPT-4o-mini estimates of concreteness, valence, arousal, age of acquisition, and familiarity, and validates them against human ratings and earlier AI-generated estimates.

## Contribution

The paper uses GPT-4o-mini to generate estimates of five characteristics of German words and validates them against human ratings and previously generated AI data. The age of acquisition and familiarity estimates come from fine-tuned versions of the model.

## Key findings

- GPT estimates of concreteness, valence, and arousal correlate strongly with human ratings but are not better than the best available AI-generated estimates based on semantic vectors.
- GPT estimates of age of acquisition approximate human ratings well and outperform every alternative except human ratings themselves, especially after fine-tuning on 2,000 human ratings.
- Fine-tuned AI-generated familiarity estimates predict word recognition in lexical decision tasks and vocabulary tests better than word frequency does.

## Abstract

This article presents AI-generated estimates for five characteristics of German words: concreteness, valence, arousal, age of acquisition (AoA), and word familiarity. The estimates were generated using GPT-4o-mini, which was selected due to its good performance in previous studies. Validation studies were conducted comparing the AI-generated estimates with both human ratings and previously generated AI data to ensure their usefulness for research applications. The main results are as follows. The GPT estimates of word concreteness, valence, and arousal show a strong correlation with human ratings but are not better than the best available AI-generated estimates based on semantic vectors. The GPT estimates of AoA are good approximations of human ratings and outperform other available alternatives (except for human ratings), especially after the model was fine-tuned based on 2,000 human ratings. Fine-tuned AI-generated estimates of word familiarity have better predictive value than word frequency for word recognition in lexical decision tasks and vocabulary tests. Estimates for concreteness, valence, arousal, and AoA are available for 167,000 words, which are likely to be known to more than 90% of participants in typical adult studies. Word familiarity estimates are presented for 928,000 word forms. All data and code, including newly collected human familiarity ratings for 11,000 words, are publicly available at https://osf.io/ghjd2/. The data may be freely used for research purposes, but not for commercial purposes.
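The validation described above rests on correlating AI-generated estimates with human ratings for the same words. The following is a minimal sketch of that kind of check, not the authors' code: the function names `pearson_r` and `validate` are hypothetical, the choice of the Pearson coefficient is an assumption (the abstract only says "correlation"), and the four-word dictionaries are made-up illustration values, not data from the paper or the OSF repository.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered cross-products and centered squares
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def validate(ai_estimates, human_ratings):
    """Correlate AI estimates with human ratings on the words both cover.

    Both arguments map a word to a numeric value (hypothetical layout).
    Returns (number of shared words, Pearson r).
    """
    shared = sorted(set(ai_estimates) & set(human_ratings))
    ai = [ai_estimates[w] for w in shared]
    human = [human_ratings[w] for w in shared]
    return len(shared), pearson_r(ai, human)


# Made-up concreteness values for illustration only (not from the paper)
toy_ai = {"Hund": 4.8, "Tisch": 4.6, "Idee": 1.5, "Freiheit": 1.9}
toy_human = {"Hund": 4.9, "Tisch": 4.7, "Idee": 1.4, "Haus": 4.8}

n_shared, r = validate(toy_ai, toy_human)
print(f"shared words: {n_shared}, r = {r:.3f}")
```

Restricting the comparison to words present in both sets matters because the AI estimates cover far more words than any human norming study, so the correlation is only defined on the overlap.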

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12785658/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12785658/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC12785658/full.md

---
Source: https://tomesphere.com/paper/PMC12785658