Latent Structure of Affective Representations in Large Language Models
Benjamin J. Choi, Melanie Weber

TL;DR
This paper investigates the geometric structure of affective representations in large language models, revealing alignment with psychological models, linear approximability, and utility for uncertainty quantification.
Contribution
The paper demonstrates that LLMs learn affective representations consistent with psychological models of emotion, which supports linearity assumptions and enables uncertainty measurement in emotion tasks.
Findings
LLMs learn affective representations aligned with valence-arousal models.
Representations exhibit nonlinear but linearly approximable geometric structure.
Latent space can be used to quantify uncertainty in emotion tasks.
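As a rough illustration of what "nonlinear but linearly approximable" can mean in practice, the sketch below fits a linear probe from hidden states to valence-arousal coordinates and reports held-out R². It is not the authors' method: the data are synthetic, the tanh embedding merely stands in for a mildly nonlinear representation, and the names `fit_linear_probe` and `r_squared` are hypothetical helpers introduced here.

```python
import numpy as np

def fit_linear_probe(hidden, targets):
    """Least-squares affine map from hidden states to affect coordinates."""
    # Append a constant column so the probe can learn a bias term.
    X = np.hstack([hidden, np.ones((hidden.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W

def r_squared(hidden, targets, W):
    """Per-dimension coefficient of determination for a fitted probe."""
    X = np.hstack([hidden, np.ones((hidden.shape[0], 1))])
    pred = X @ W
    ss_res = ((targets - pred) ** 2).sum(axis=0)
    ss_tot = ((targets - targets.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): 2-D valence-arousal points pushed
# through a random linear map and a tanh nonlinearity, plus small noise.
rng = np.random.default_rng(0)
n, d = 500, 32
va = rng.uniform(-1.0, 1.0, size=(n, 2))       # columns: valence, arousal
A = rng.normal(size=(2, d))
hidden = np.tanh(va @ A) + 0.05 * rng.normal(size=(n, d))

# Fit on the first 400 points, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, n)
W = fit_linear_probe(hidden[train], va[train])
scores = r_squared(hidden[test], va[test], W)
print("held-out R^2 (valence, arousal):", scores)
```

With real model activations, `hidden` would be replaced by layer outputs for emotion-labelled text and `va` by human valence-arousal ratings. A high held-out R² would suggest the affective dimensions are largely recoverable by a linear map even if the underlying geometry is curved.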
Abstract
The geometric structure of latent representations in large language models (LLMs) is an active area of research, driven in part by its implications for model transparency and AI safety. Existing literature has focused mainly on general geometric and topological properties of the learnt representations, but due to a lack of ground-truth latent geometry, validating the findings of such approaches is challenging. Emotion processing provides an intriguing testbed for probing representational geometry, as emotions exhibit both categorical organization and continuous affective dimensions, which are well-established in the psychology literature. Moreover, understanding such representations carries safety relevance. In this work, we investigate the latent structure of affective representations in LLMs using geometric data analysis tools. We present three main findings. First, we show that LLMs…
