LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models

Hugo Pitorro; Marcos Treviso

arXiv:2502.15612 · cs.CL · February 26, 2025

TL;DR

LaTIM is an interpretability method that decomposes token-to-token interactions in Mamba models, showing how these models process sequences across layers. Earlier analyses of Mamba did not provide this kind of explicit token-wise decomposition.

Contribution

We introduce LaTIM, a token-level decomposition technique for Mamba-1 and Mamba-2 that enables fine-grained interpretability of token interactions in state space models.

Findings

01 LaTIM reveals token-to-token interaction patterns in Mamba.

02 It is evaluated on machine translation, copying, and retrieval-based generation.

03 It improves understanding of Mamba's internal mechanisms.

Abstract

State space models (SSMs), such as Mamba, have emerged as an efficient alternative to transformers for long-context sequence modeling. However, despite their growing adoption, SSMs lack the interpretability tools that have been crucial for understanding and improving attention-based architectures. While recent efforts provide insights into Mamba's internal mechanisms, they do not explicitly decompose token-wise contributions, leaving gaps in understanding how Mamba selectively processes sequences across layers. In this work, we introduce LaTIM, a novel token-level decomposition method for both Mamba-1 and Mamba-2 that enables fine-grained interpretability. We extensively evaluate our method across diverse tasks, including machine translation, copying, and retrieval-based generation, demonstrating its effectiveness in revealing Mamba's token-to-token interaction patterns.
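To make the idea of a token-wise decomposition concrete, the sketch below unrolls a simplified selective SSM recurrence, h_t = a_t * h_{t-1} + b_t * x_t with output y_t = <c_t, h_t>, into per-token contributions. It is not the LaTIM method itself. It assumes a diagonal state transition and a single scalar input channel, and it omits the rest of the Mamba block. All function and variable names are hypothetical and chosen for illustration only.

```python
import random

def ssm_scan(a, b, c, x):
    """Run h_t = a_t * h_{t-1} + b_t * x_t and return y_t = <c_t, h_t> for every t."""
    n = len(a[0])
    h = [0.0] * n
    ys = []
    for t in range(len(x)):
        h = [a[t][s] * h[s] + b[t][s] * x[t] for s in range(n)]
        ys.append(sum(c[t][s] * h[s] for s in range(n)))
    return ys

def token_contributions(a, b, c, x):
    """Return contrib where contrib[t][j] is the part of y_t due to input token j (j <= t)."""
    T, n = len(x), len(a[0])
    contrib = [[0.0] * T for _ in range(T)]
    for j in range(T):
        # State written by token j alone, before any later decay is applied.
        h = [b[j][s] * x[j] for s in range(n)]
        for t in range(j, T):
            if t > j:
                # Carry token j's state forward through the input-dependent decay a_t.
                h = [a[t][s] * h[s] for s in range(n)]
            contrib[t][j] = sum(c[t][s] * h[s] for s in range(n))
    return contrib

# Toy, randomly generated parameters (assumed values, not taken from any trained model).
random.seed(0)
T, N = 5, 3
a = [[random.uniform(0.5, 1.0) for _ in range(N)] for _ in range(T)]
b = [[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(T)]
c = [[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(T)]
x = [random.uniform(-1.0, 1.0) for _ in range(T)]

ys = ssm_scan(a, b, c, x)
contrib = token_contributions(a, b, c, x)

# Each output should equal the sum of its per-token contributions.
max_err = max(abs(ys[t] - sum(contrib[t])) for t in range(T))
```

Because the recurrence is linear in the inputs, each row of `contrib` sums to the corresponding output, and the lower-triangular matrix plays a role loosely analogous to an attention map: entry (t, j) indicates how much token j influences position t.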

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces