Post-Hoc Understanding of Metaphor Processing in Decoder-Only Language Models via Conditional Scale Entropy
Lawhori Chakrabarti, Jennifer Johnson-Leung, Bert Baumgaertner, Aleksandar Vakanski, Min Xian, Boyu Zhang

TL;DR
This paper introduces conditional scale entropy (CSE), a wavelet-based measure, to analyze how decoder-only language models process metaphors across layers, revealing a consistent spectral breadth signature for metaphorical tokens.
Contribution
The study develops CSE as a novel interpretability tool and demonstrates its effectiveness in identifying metaphor processing signatures across various transformer architectures.
Findings
Metaphorical tokens show higher spectral breadth than literal tokens across models.
The CSE effect persists across different model sizes and architectures.
The spectral breadth signature is not due to semantic complexity or propositional content.
Abstract
Metaphor requires a language model to resolve a token whose contextual meaning diverges from its basic literal sense. Understanding how transformer models organize this reinterpretation across depth remains an open problem in mechanistic interpretability. We introduce conditional scale entropy (CSE), a wavelet-derived measure of how broadly transformer computation engages across frequency scales at each layer position. Two theorems establish that CSE is invariant to update magnitude, isolating the structural pattern of updates from their intensity. Using CSE, we find that metaphorical tokens produce significantly higher spectral breadth than literal tokens at contiguous layer positions on every decoder-only architecture tested, from 124M to 20B parameters (GPT-2 family, LLaMA-2 7B, GPT-oss 20B). The effect survives cluster-based permutation correction, recurs in the early-to-mid…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
