From Associations to Activations: Comparing Behavioral and Hidden-State Semantic Geometry in LLMs
Louis Schiekiera, Max Zimmer, Christophe Roux, Sebastian Pokutta, Fritz Günther

TL;DR
This study compares language models' internal semantic representations with their observable behavior in psycholinguistic tasks, and finds that forced-choice behavior reflects the underlying hidden states more closely than free association does.
Contribution
It demonstrates that behavioral data, especially from forced-choice tasks, can recover and predict the semantic geometry of hidden states in large language models.
Findings
Forced-choice behavior aligns more with hidden-state geometry than free association.
Behavioral similarity predicts unseen hidden-state similarities beyond lexical baselines.
Behavioral tasks can reveal internal semantic structures of language models.
Abstract
We investigate the extent to which an LLM's hidden-state geometry can be recovered from its behavior in psycholinguistic experiments. Across eight instruction-tuned transformer models, we run two experimental paradigms -- similarity-based forced choice and free association -- over a shared 5,000-word vocabulary, collecting 17.5M+ trials to build behavior-based similarity matrices. Using representational similarity analysis, we compare behavioral geometries to layerwise hidden-state similarity and benchmark against FastText, BERT, and cross-model consensus. We find that forced-choice behavior aligns substantially more with hidden-state geometry than free association. In a held-out-words regression, behavioral similarity (especially forced choice) predicts unseen hidden-state similarities beyond lexical baselines and cross-model consensus, indicating that behavior-only measurements retain…
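As a rough illustration of the representational similarity analysis the abstract mentions, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration rather than the authors' pipeline: the helper names (`cosine_similarity_matrix`, `upper_triangle`, `rsa_score`), the choice of cosine similarity, the rank (Spearman-style) correlation, and the toy matrices standing in for hidden states and behavior.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X (words x features)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    return Xn @ Xn.T

def upper_triangle(S):
    """Off-diagonal upper-triangle entries of a square similarity matrix."""
    i, j = np.triu_indices(S.shape[0], k=1)
    return S[i, j]

def ranks(v):
    """Ordinal ranks; ties are broken by position (fine for continuous data)."""
    order = np.argsort(v, kind="stable")
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(len(v))
    return r

def rsa_score(S_a, S_b):
    """Rank correlation between the word-pair similarities of two geometries."""
    a = ranks(upper_triangle(S_a))
    b = ranks(upper_triangle(S_b))
    return float(np.corrcoef(a, b)[0, 1])

# Toy data (hypothetical): 50 "words" with 16-dimensional "hidden states".
rng = np.random.default_rng(0)
hidden = rng.normal(size=(50, 16))
S_hidden = cosine_similarity_matrix(hidden)

# A stand-in "behavioral" matrix: the hidden-state similarities plus symmetric noise.
noise = rng.normal(scale=0.05, size=S_hidden.shape)
S_behavior = S_hidden + (noise + noise.T) / 2

# An unrelated geometry, used as a baseline.
S_random = cosine_similarity_matrix(rng.normal(size=(50, 16)))

print("behavior vs hidden:", rsa_score(S_hidden, S_behavior))
print("random vs hidden:  ", rsa_score(S_hidden, S_random))
```

In this sketch only the upper triangle is compared, so each word pair is counted once and the trivial diagonal is excluded. A layerwise analysis of the kind described above would repeat `rsa_score` once per layer's similarity matrix.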
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Natural Language Processing Techniques · Topic Modeling
