LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs
Benno Krojer, Shravan Nayak, Oscar Mañas, Vaibhav Adlakha, Desmond Elliott, Siva Reddy, Marius Mosbach

TL;DR
LatentLens is an interpretability method that reveals the semantic content of visual tokens in large vision-language models by mapping their representations to natural-language descriptions. It finds that visual tokens are highly interpretable across models and layers.
Contribution
This work introduces LatentLens, an approach for interpreting visual tokens in VLMs: each visual token representation is matched, via nearest-neighbor search, against contextualized token representations from a large text corpus, and the closest matches serve as its description.
Findings
Most visual tokens are interpretable across models and layers.
LatentLens provides more meaningful and fine-grained descriptions than previous methods.
Common interpretability methods underestimate visual token interpretability.
Abstract
Transforming a large language model (LLM) into a Vision-Language Model (VLM) can be achieved by mapping the visual tokens from a vision encoder into the embedding space of an LLM. Intriguingly, this mapping can be as simple as a shallow MLP transformation. To understand why LLMs can so readily process visual tokens, we need interpretability methods that reveal what is encoded in the visual token representations at every layer of LLM processing. In this work, we introduce LatentLens, a novel approach for mapping latent representations to descriptions in natural language. LatentLens works by encoding a large text corpus and storing contextualized token representations for each token in that corpus. Visual token representations are then compared to their contextualized textual representations, with the top-k nearest neighbor representations providing descriptions of the visual token. We…
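The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: cosine similarity as the comparison metric, the three-dimensional toy vectors, and the names `cosine`, `top_k_descriptions`, and `text_bank` are all assumptions made for illustration. In practice both the visual and the textual representations would come from the LLM's hidden states rather than hand-written lists.

```python
import math
from typing import List, Sequence, Tuple

# One entry per corpus token: (token, sentence it occurred in, contextualized vector).
# The vectors below are made-up toy values, not real model activations.
BankEntry = Tuple[str, str, Sequence[float]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_descriptions(
    visual_vec: Sequence[float], text_bank: List[BankEntry], k: int = 3
) -> List[Tuple[float, str, str]]:
    """Return the k bank entries most similar to visual_vec as (score, token, context)."""
    scored = [(cosine(visual_vec, vec), token, context) for token, context, vec in text_bank]
    scored.sort(key=lambda item: item[0], reverse=True)  # highest similarity first
    return scored[:k]


# Hypothetical bank of contextualized token representations from a text corpus.
text_bank: List[BankEntry] = [
    ("dog", "a dog runs on the beach", [0.9, 0.1, 0.0]),
    ("puppy", "the puppy chased a ball", [0.8, 0.3, 0.1]),
    ("car", "she parked the car outside", [0.0, 0.2, 0.9]),
    ("bark", "we heard the dog bark", [0.6, 0.6, 0.1]),
]

# Hypothetical representation of one visual token at some LLM layer.
visual_vec = [1.0, 0.2, 0.0]

neighbors = top_k_descriptions(visual_vec, text_bank, k=2)
for score, token, context in neighbors:
    print(f"{score:.3f}  {token!r} in: {context}")
```

Because each stored vector is tied to the sentence it came from, the retrieved neighbors carry their surrounding context along with the token, which is what lets the neighbors act as a description of the visual token and not just a single-word label. A real corpus would be far too large for this linear scan, so an approximate nearest-neighbor index would likely be needed.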
Taxonomy
Topics: Multimodal Machine Learning Applications · Language, Metaphor, and Cognition · Language and cultural evolution
