Post-Hoc Understanding of Metaphor Processing in Decoder-Only Language Models via Conditional Scale Entropy

Lawhori Chakrabarti; Jennifer Johnson-Leung; Bert Baumgaertner; Aleksandar Vakanski; Min Xian; Boyu Zhang

arXiv:2605.21391 · cs.CL · May 21, 2026

TL;DR

This paper introduces conditional scale entropy (CSE), a wavelet-based measure, to analyze how decoder-only language models process metaphors across layers, revealing a consistent spectral breadth signature for metaphorical tokens.

Contribution

The study develops CSE as a novel interpretability tool and demonstrates its effectiveness in identifying metaphor processing signatures across various transformer architectures.

Findings

01. Metaphorical tokens show higher spectral breadth than literal tokens across models.

02. The CSE effect persists across different model sizes and architectures.

03. The spectral breadth signature is not due to semantic complexity or propositional content.

Abstract

Metaphor requires a language model to resolve a token whose contextual meaning diverges from its basic literal sense. Understanding how transformer models organize this reinterpretation across depth remains an open problem in mechanistic interpretability. We introduce conditional scale entropy (CSE), a wavelet-derived measure of how broadly transformer computation engages across frequency scales at each layer position. Two theorems establish that CSE is invariant to update magnitude, isolating the structural pattern of updates from their intensity. Using CSE, we find that metaphorical tokens produce significantly higher spectral breadth than literal tokens at contiguous layer positions on every decoder-only architecture tested, from 124M to 20B parameters (GPT-2 family, LLaMA-2 7B, GPT-oss 20B). The effect survives cluster-based permutation correction, recurs in the early-to-mid…
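
The excerpt above does not give the formal definition of CSE, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a one-dimensional signal over depth (for example, one scalar per layer summarizing the update for a single token), a hand-written Ricker wavelet, and an arbitrary set of scales; the names `ricker`, `conditional_scale_entropy`, and `scales` are hypothetical. At each layer position, wavelet energy is normalized across scales into a distribution and its Shannon entropy is taken, which illustrates why a measure of this kind can be invariant to update magnitude.

```python
import numpy as np


def ricker(points, a):
    """Ricker (Mexican-hat) wavelet sampled at `points` positions, width `a`."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)


def conditional_scale_entropy(signal, scales=(1, 2, 4)):
    """Entropy of the wavelet-energy distribution over scales, per position.

    ASSUMPTION: this is an illustrative stand-in, not the paper's definition.
    `signal` holds one value per layer; the output has the same length.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.shape[0]

    # Wavelet coefficients: one row per scale, one column per layer position.
    coeffs = np.empty((len(scales), n))
    for i, a in enumerate(scales):
        kernel = ricker(min(10 * a, n), a)  # kernel never longer than signal
        coeffs[i] = np.convolve(signal, kernel, mode="same")

    # Normalize energy across scales at each position; this step cancels any
    # global rescaling of the signal.
    energy = coeffs ** 2
    total = energy.sum(axis=0, keepdims=True)
    uniform = np.full_like(energy, 1.0 / len(scales))
    p = np.divide(energy, total, out=uniform, where=total > 0)

    # Shannon entropy over scales, treating 0 * log(0) as 0.
    logp = np.log(p, out=np.zeros_like(p), where=p > 0)
    return -(p * logp).sum(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    updates = rng.normal(size=24)  # synthetic stand-in for 24 layers
    cse = conditional_scale_entropy(updates)
    # Rescaling the updates leaves the entropy profile unchanged.
    print(np.allclose(cse, conditional_scale_entropy(10.0 * updates)))
```

In this sketch, higher values mean the energy at a layer position is spread more evenly across scales, and the maximum possible value is the logarithm of the number of scales. The data here are synthetic and say nothing about the results reported in the paper.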
