Interpreting token compositionality in LLMs: A robustness analysis

Nura Aljaafari; Danilo S. Carvalho; André Freitas

arXiv:2410.12924 · cs.CL · May 21, 2025

Open Access

TL;DR

This paper introduces Constituent-Aware Pooling (CAP), a method to analyze how large language models process compositional structures, revealing limitations in their ability to form unified semantic representations and highlighting the need for improved architectures.

Contribution

We propose CAP, a novel interpretability technique that systematically examines how transformers handle compositional linguistic structures, uncovering fragmented processing and limits on how well they integrate those structures.

Findings

1. Transformers show fragmented information processing on compositional tasks.

2. Larger models tend to process information in a more fragmented way.

3. Current architectures struggle to integrate tokens into cohesive semantic representations.

Abstract

Understanding the internal mechanisms of large language models (LLMs) is integral to enhancing their reliability, interpretability, and inference processes. We present Constituent-Aware Pooling (CAP), a methodology designed to analyse how LLMs process compositional linguistic structures. Grounded in principles of compositionality, mechanistic interpretability, and information theory, CAP systematically intervenes in model activations through constituent-based pooling at various model levels. Our experiments on inverse definition modelling, hypernym and synonym prediction reveal critical insights into transformers' limitations in handling compositional abstractions. No specific layer integrates tokens into unified semantic representations based on their constituent parts. We observe fragmented information processing, which intensifies with model size, suggesting that larger models…
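To make the idea of constituent-based pooling concrete, here is a minimal sketch of one way token-level activations could be collapsed into one vector per constituent. It is an illustration under stated assumptions, not the authors' implementation: the function name `pool_constituents`, the half-open span format, and the choice of reducers (mean, max) are all hypothetical, and the toy activations are made up.

```python
from statistics import fmean
from typing import Callable, Sequence

Vector = list[float]


def pool_constituents(
    activations: Sequence[Sequence[float]],
    spans: Sequence[tuple[int, int]],
    reduce: Callable[[Sequence[float]], float] = fmean,
) -> list[Vector]:
    """Collapse each constituent's token vectors into a single vector.

    activations: one hidden-state vector per token (seq_len x hidden_dim).
    spans: half-open (start, end) token ranges, one per constituent
           (assumed format; not taken from the paper).
    reduce: per-dimension reducer such as fmean, max, or sum.
    """
    pooled: list[Vector] = []
    for start, end in spans:
        if not 0 <= start < end <= len(activations):
            raise ValueError(f"invalid span ({start}, {end})")
        group = activations[start:end]
        # zip(*group) walks the hidden dimensions across the span's tokens.
        pooled.append([float(reduce(dim)) for dim in zip(*group)])
    return pooled


# Toy input: 4 tokens with hidden size 2; tokens 1 and 2 form one constituent.
acts = [[1.0, 0.0], [2.0, 4.0], [4.0, 0.0], [0.0, 1.0]]
spans = [(0, 1), (1, 3), (3, 4)]

mean_pooled = pool_constituents(acts, spans)              # 3 vectors instead of 4
max_pooled = pool_constituents(acts, spans, reduce=max)
```

In an intervention setting, the pooled sequence would replace the original token-level activations at a chosen layer, and the change in downstream task accuracy would indicate how much the model depends on token-level detail at that depth.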

Taxonomy

Topics: Library Science and Information Systems · Semantic Web and Ontologies

Methods: Fragmentation