Emotions Where Art Thou: Understanding and Characterizing the Emotional Latent Space of Large Language Models

Benjamin Reichman; Adar Avsian; Larry Heck

arXiv:2510.22042·cs.CL·February 2, 2026

3 Reviews

TL;DR

This paper uncovers a low-dimensional, stable, and universal emotional manifold within large language models, enabling effective emotion manipulation across languages and domains.

Contribution

It reveals the existence of a stable, interpretable emotional subspace in LLMs and introduces a method to steer internal emotion perception while preserving semantics.

Findings

01 Emotional representations form a low-dimensional manifold.

02 The emotional space is stable across model layers and languages.

03 A learned intervention module steers internal emotion perception while preserving semantics.

Abstract

This work investigates how large language models (LLMs) internally represent emotion by analyzing the geometry of their hidden-state space. The paper identifies a low-dimensional emotional manifold and shows that emotional representations are directionally encoded, distributed across layers, and aligned with interpretable dimensions. These structures are stable across depth and generalize to eight real-world emotion datasets spanning five languages. Cross-domain alignment yields low error and strong linear probe performance, indicating a universal emotional subspace. Within this space, internal emotion perception can be steered while preserving semantics using a learned intervention module, with especially strong control for basic emotions across languages. These findings reveal a consistent and manipulable affective geometry in LLMs and offer insight into how they internalize and…
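As a rough illustration of what a "low-dimensional emotional manifold" can mean in practice, the sketch below applies an SVD to per-emotion mean hidden states and reports how much variance the leading directions capture. It is not the authors' pipeline: the function name `emotion_subspace`, the synthetic data, and the choice to work on class means are all assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def emotion_subspace(hidden, labels, k=3):
    """Hypothetical helper: top-k directions of the centered per-emotion means.

    hidden: (n_samples, dim) array of hidden states
    labels: (n_samples,) array of emotion ids
    Returns (directions, variance_ratio) with shapes (k, dim) and (k,).
    """
    hidden = np.asarray(hidden, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # One mean vector per emotion class.
    means = np.stack([hidden[labels == c].mean(axis=0) for c in classes])
    # Remove the component shared by all emotions.
    centered = means - means.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance_ratio = s**2 / np.sum(s**2)
    return vt[:k], variance_ratio[:k]

# Synthetic stand-in for hidden states: six "emotions" whose means lie in a
# planted 2-D subspace of a 64-D space, plus small isotropic noise.
rng = np.random.default_rng(0)
n_per, dim = 50, 64
basis = np.linalg.qr(rng.normal(size=(dim, 2)))[0].T  # (2, dim), orthonormal rows
coords = rng.normal(size=(6, 2)) * 5.0
hidden = np.concatenate(
    [coords[i] @ basis + 0.1 * rng.normal(size=(n_per, dim)) for i in range(6)]
)
labels = np.repeat(np.arange(6), n_per)

directions, ratio = emotion_subspace(hidden, labels, k=2)
print("variance captured by the top 2 directions:", ratio.sum())
```

On this toy data the top two directions should capture nearly all of the between-emotion variance, because the means were planted in a 2-D subspace. Real hidden states would need to be extracted from a model first, which this sketch does not attempt.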

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 4·Confidence 3

Strengths

- Identification of a low-dimensional, directionally encoded emotional manifold: The paper demonstrates that emotions in LLMs occupy a low-dimensional subspace that is interpretable and directionally organized across layers, with principal axes (PC1–PC3) showing high rank correlations in many models/layers.
- Cross-dataset and multilingual generalization of emotional structure: Using eight emotion datasets spanning five languages and diverse textual styles, the authors show that the extracted em…

Weaknesses

- Geometry vs. local distortion, with inconsistent relational preservation: Although global alignment measures (cosine, regression) are often strong, stress and distortion analyses reveal notable local warping of relative geometry in many layers and datasets. Thus the emotional manifold is not uniformly faithful to human emotion-space relations, which complicates interpretation and downstream use.
- Uneven multilingual and dataset robustness: Performance and steerability degrade in lower-resource se…

Reviewer 02·Rating 6·Confidence 2

Strengths

- The paper presents a comprehensive, cross-lingual study covering eight datasets in five languages, offering strong evidence for the universality of LLM emotion representations.
- The use of ML-AURA and SVD-based analyses provides a rigorous and interpretable framework for linking internal neuron activity to affective semantics.
- The learned steering module demonstrates practical control of emotion representations while preserving meaning, which is an innovative advance beyond descriptive anal…

Weaknesses

- While broad in scope, the work is methodologically complex, and the abundance of metrics (stress, distortion, spectral flatness, etc.) may obscure key takeaways.
- The evaluation relies heavily on synthetic emotion text for subspace construction, which may bias the identified directions.
- Although the paper claims semantic preservation under steering, this is mostly supported by cosine similarity metrics rather than human evaluations.
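For readers unfamiliar with the metric named in the last point, a minimal sketch of a cosine-similarity check between an original and a steered representation follows. The vectors are made-up toy values and `cosine_similarity` is a hypothetical helper; how the paper actually computes or aggregates this score is not stated in the review.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy stand-ins for a sentence representation before and after steering.
original = [0.2, 0.7, -0.1, 0.4]
steered = [0.25, 0.65, -0.05, 0.45]

score = cosine_similarity(original, steered)
print(f"cosine similarity: {score:.4f}")
```

A score near 1 only says the two vectors point in nearly the same direction. It does not by itself show that a human reader would judge the meaning unchanged, which is the gap the reviewer points to.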

Reviewer 03·Rating 6·Confidence 4

Strengths

This paper presents the first investigation of emotional latent-space representations in LLMs, and I believe the techniques used are novel and interesting. The authors analyze many different perspectives, styles, and languages, which adds to the robustness of their findings. The steering method introduced offers a new way to think about changing emotions: by changing the underlying emotional subspace rather than focusing on downstream output.

Weaknesses

It is unclear exactly what Table 2 is measuring, specifically with regard to cosine similarity and MSE. The paper reports high cosine similarity between emotions in real datasets and their synthetic counterparts, but which synthetic counterpart is meant? Is it the Reichman et al. synthetic dataset? Table 2 also does not state which method was used. What does it mean to measure the cosine similarity of an emotion between two datasets, and what datum from each dataset is actu…
