LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

Benno Krojer; Shravan Nayak; Oscar Mañas; Vaibhav Adlakha; Desmond Elliott; Siva Reddy; Marius Mosbach

arXiv:2602.00462 · cs.CV · February 26, 2026

TL;DR

LatentLens is an interpretability method that describes the visual tokens inside vision-language models in natural language by matching their representations to contextualized text token representations. The paper reports that most visual tokens are interpretable across models and layers.

Contribution

This work introduces LatentLens, an approach for interpreting visual tokens in VLMs: each visual token representation is matched, via nearest-neighbor search, against contextualized token representations from a large text corpus, and the nearest neighbors serve as its description.

Findings

01. Most visual tokens are interpretable across models and layers.

02. LatentLens provides more meaningful and fine-grained descriptions than previous methods.

03. Common interpretability methods underestimate visual token interpretability.

Abstract

Transforming a large language model (LLM) into a Vision-Language Model (VLM) can be achieved by mapping the visual tokens from a vision encoder into the embedding space of an LLM. Intriguingly, this mapping can be as simple as a shallow MLP transformation. To understand why LLMs can so readily process visual tokens, we need interpretability methods that reveal what is encoded in the visual token representations at every layer of LLM processing. In this work, we introduce LatentLens, a novel approach for mapping latent representations to descriptions in natural language. LatentLens works by encoding a large text corpus and storing contextualized token representations for each token in that corpus. Visual token representations are then compared to their contextualized textual representations, with the top-k nearest neighbor representations providing descriptions of the visual token. We…
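The lookup step described in the abstract (store contextualized text token representations, then describe a visual token by its top-k nearest neighbors) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the similarity measure (cosine), the three-dimensional toy vectors, and the names `cosine`, `top_k_neighbors`, and `TOY_BANK` are all assumptions made for the example. In practice the vectors would come from an LLM layer and the bank would hold representations from a large corpus.

```python
import math
from typing import List, Sequence, Tuple

# One bank entry: (token, sentence it appeared in, contextualized vector).
# All values here are made up for illustration.
BankEntry = Tuple[str, str, Sequence[float]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (assumed metric, not from the paper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_neighbors(
    query: Sequence[float], bank: List[BankEntry], k: int = 3
) -> List[Tuple[float, str, str]]:
    """Return the k bank entries most similar to the query, best first."""
    scored = [(cosine(query, vec), token, context) for token, context, vec in bank]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical bank of contextualized text token representations.
TOY_BANK: List[BankEntry] = [
    ("dog", "a brown dog runs on the grass", [0.9, 0.1, 0.0]),
    ("puppy", "the puppy sleeps on the sofa", [0.8, 0.2, 0.1]),
    ("car", "a red car is parked outside", [0.0, 0.1, 0.9]),
    ("grass", "the grass is freshly cut", [0.1, 0.9, 0.0]),
]

# Hypothetical visual token representation taken from some LLM layer.
visual_token = [1.0, 0.15, 0.05]

for score, token, context in top_k_neighbors(visual_token, TOY_BANK, k=2):
    print(f"{token!r} (from: {context!r}) similarity={score:.3f}")
```

Because each neighbor carries the sentence it came from, the description of a visual token is a word in context and not just a bare vocabulary item. At corpus scale, the exhaustive scan above would likely be replaced by an approximate nearest-neighbor index.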

Code & Models

Models

McGill-NLP/latentlens-connectors (Hugging Face model)

Taxonomy

Topics: Multimodal Machine Learning Applications · Language, Metaphor, and Cognition · Language and cultural evolution