Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison

Jihoon Jeong

arXiv:2604.04064 · cs.CL · April 7, 2026

TL;DR

This paper compares generation-based and comprehension-based methods for extracting emotion representations in small language models and tests them with steering experiments, examining the models' internal emotional structure, the effects of architecture, and cross-lingual entanglement.

Contribution

It provides the first comparative analysis of emotion extraction methods in small language models, highlighting the superiority of generation-based extraction and detailing how emotion representations are localized and manipulated.

Findings

01

Generation-based extraction yields better emotion separation.

02

Emotion representations localize at middle transformer layers (~50% depth).

03

Steering experiments confirm causal effects and reveal three operational regimes.
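
To make the extraction and steering findings above concrete, here is a minimal sketch of one common approach: take the difference of mean activations between emotional and neutral inputs as the emotion vector, then add a scaled copy of it to hidden states. This page does not spell out the paper's exact procedure, so the difference-of-means choice, the synthetic activations, and the names `extract_emotion_vector`, `steer`, `hidden_dim`, and `alpha` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 16  # hypothetical hidden size, far smaller than any real model

# Synthetic stand-ins for activations collected at one middle layer.
# In practice these would come from a model's hidden states (assumption).
true_direction = np.zeros(hidden_dim)
true_direction[0] = 1.0
neutral_acts = rng.normal(size=(200, hidden_dim))
emotion_acts = rng.normal(size=(200, hidden_dim)) + 3.0 * true_direction


def extract_emotion_vector(emotion, neutral):
    """Difference of class means, normalized to unit length."""
    diff = emotion.mean(axis=0) - neutral.mean(axis=0)
    return diff / np.linalg.norm(diff)


def steer(hidden, vector, alpha):
    """Additive steering: shift every hidden state by alpha * vector."""
    return hidden + alpha * vector


emotion_vec = extract_emotion_vector(emotion_acts, neutral_acts)
steered = steer(neutral_acts, emotion_vec, alpha=4.0)

# Mean projection onto the emotion direction before and after steering.
before = float((neutral_acts @ emotion_vec).mean())
after = float((steered @ emotion_vec).mean())
print(f"projection before: {before:.3f}, after: {after:.3f}")
```

Because the vector has unit length, steering with `alpha` raises the mean projection onto it by exactly `alpha`. In a real model the shift would be applied inside the forward pass at a chosen layer, and the behavioral effect would have to be measured on generated text.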

Abstract

Small language models (SLMs) in the 100M-10B parameter range increasingly power production systems, yet whether they possess the internal emotion representations recently discovered in frontier models remains unknown. We present the first comparative analysis of emotion vector extraction methods for SLMs, evaluating 9 models across 5 architectural families (GPT-2, Gemma, Qwen, Llama, Mistral) using 20 emotions and two extraction methods (generation-based and comprehension-based). Generation-based extraction produces statistically superior emotion separation (Mann-Whitney p = 0.007; Cohen's d = -107.5), with the advantage modulated by instruction tuning and architecture. Emotion representations localize at middle transformer layers (~50% depth), following a U-shaped curve that is architecture-invariant from 124M to 3B parameters. We validate these findings against representational…
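
As a reading aid for the effect size quoted in the abstract, the sketch below computes Cohen's d with a pooled standard deviation on two small, made-up score lists. The numbers and the function name `cohens_d` are hypothetical and do not come from the paper. The sign depends only on which group is subtracted from which, and this page does not state the paper's convention.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: mean difference (a minus b) over the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical separation scores for two extraction methods (not real data).
method_a = [0.10, 0.12, 0.11]
method_b = [0.30, 0.32, 0.31]

d = cohens_d(method_a, method_b)
print(f"Cohen's d = {d:.1f}")
```

In this toy case the gap between the group means is 0.20 and the pooled standard deviation is 0.01, so d is about -20. More generally, the magnitude of d grows large whenever the within-group spread is small relative to the gap between means.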
