Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari, Danilo S. Carvalho, André Freitas

TL;DR
This paper introduces Constituent-Aware Pooling (CAP), a method to analyze how large language models process compositional structures, revealing limitations in their ability to form unified semantic representations and highlighting the need for improved architectures.
Contribution
We propose CAP, a novel interpretability technique that systematically examines how transformers handle compositional linguistic structures, uncovering fragmented processing and limits on how they integrate tokens into unified representations.
Findings
Transformers show fragmented information processing for compositional tasks.
Fragmentation of information processing intensifies with model size.
Current architectures struggle with integrating tokens into cohesive semantic representations.
Abstract
Understanding the internal mechanisms of large language models (LLMs) is integral to enhancing their reliability, interpretability, and inference processes. We present Constituent-Aware Pooling (CAP), a methodology designed to analyse how LLMs process compositional linguistic structures. Grounded in principles of compositionality, mechanistic interpretability, and information theory, CAP systematically intervenes in model activations through constituent-based pooling at various model levels. Our experiments on inverse definition modelling, hypernym and synonym prediction reveal critical insights into transformers' limitations in handling compositional abstractions. No specific layer integrates tokens into unified semantic representations based on their constituent parts. We observe fragmented information processing, which intensifies with model size, suggesting that larger models…
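The constituent-based pooling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the span format (half-open token index pairs), the choice of mean and max as pooling functions, and the names `pool_constituents`, `mean_pool`, and `max_pool` are all assumptions made for illustration. It only shows the general idea of replacing the activations of several tokens with one pooled vector per constituent.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def max_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise maximum of equally sized vectors."""
    return [max(column) for column in zip(*vectors)]


def pool_constituents(
    activations: Sequence[Vector],
    spans: Sequence[Tuple[int, int]],
    pool: Callable[[Sequence[Vector]], Vector] = mean_pool,
) -> List[Vector]:
    """Collapse each half-open token span [start, end) into one pooled vector.

    `spans` is a hypothetical representation of constituents (for example,
    the sub-word tokens of one word or the tokens of one phrase).
    """
    pooled = []
    for start, end in spans:
        if not 0 <= start < end <= len(activations):
            raise ValueError(f"invalid span ({start}, {end})")
        pooled.append(pool(activations[start:end]))
    return pooled


# Toy data: four token activations of width 2, grouped into three
# hypothetical constituents (tokens 0-1 together, then 2 and 3 alone).
acts = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
spans = [(0, 2), (2, 3), (3, 4)]

pooled_mean = pool_constituents(acts, spans)
pooled_max = pool_constituents(acts, spans, pool=max_pool)
```

In an actual intervention, the pooled vectors would be written back into the model at a chosen layer so that later layers see one representation per constituent. That step depends on the model and framework and is left out here.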
Taxonomy
Topics: Library Science and Information Systems · Semantic Web and Ontologies
Methods: Fragmentation
