VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set

Shufan Shen; Junshu Sun; Qingming Huang; Shuhui Wang

arXiv:2510.21323 · cs.CV · October 27, 2025

TL;DR

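VL-SAE is a sparse autoencoder whose hidden neurons each correlate with a concept shared by semantically similar images and texts. The resulting unified concept set is used to interpret vision-language alignment in VLMs and to improve downstream performance, including zero-shot image classification and hallucination reduction.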

Contribution

The paper proposes VL-SAE, a sparse autoencoder that maps vision-language representations to a concept set, enabling interpretability and alignment enhancement in VLMs.

Findings

01. VL-SAE effectively interprets vision-language representations.

02. VL-SAE improves zero-shot image classification accuracy.

03. VL-SAE reduces hallucinations in VLM outputs.

Abstract

The alignment of vision-language representations endows current Vision-Language Models (VLMs) with strong multi-modal reasoning capabilities. However, the interpretability of the alignment component remains uninvestigated due to the difficulty in mapping the semantics of multi-modal representations into a unified concept set. To address this problem, we propose VL-SAE, a sparse autoencoder that encodes vision-language representations into its hidden activations. Each neuron in its hidden layer correlates to a concept represented by semantically similar images and texts, thereby interpreting these representations with a unified concept set. To establish the neuron-concept correlation, we encourage semantically similar representations to exhibit consistent neuron activations during self-supervised training. First, to measure the semantic similarity of multi-modal representations, we…
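To make the idea of "encoding representations into hidden activations" concrete, the following is a minimal sketch of a generic sparse autoencoder forward pass in plain Python. It is not the authors' architecture or training procedure. The top-k sparsity rule, the tied encoder/decoder weights, the random untrained weights, and all names (`encode`, `decode`, `active_neurons`, `DIM`, `N_HIDDEN`, `K`) are assumptions chosen for illustration. The two toy vectors stand in for an image representation and a text representation that are semantically close.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def encode(x, weights, k):
    """ReLU pre-activations, then keep only the k largest (assumed top-k sparsity)."""
    pre = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    keep = set(sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k])
    return [pre[i] if i in keep else 0.0 for i in range(len(pre))]

def decode(h, weights):
    """Linear reconstruction with tied weights (an assumption of this sketch)."""
    dim = len(weights[0])
    return [sum(h[j] * weights[j][d] for j in range(len(h))) for d in range(dim)]

def active_neurons(h):
    """Indices of hidden neurons with non-zero activation."""
    return {i for i, a in enumerate(h) if a > 0.0}

# Hypothetical sizes: 8-dim representations, 32 hidden neurons, 4 active per input.
DIM, N_HIDDEN, K = 8, 32, 4
random.seed(0)

# Untrained random weights; a real model would learn these.
W = [normalize([random.gauss(0, 1) for _ in range(DIM)]) for _ in range(N_HIDDEN)]

# Toy stand-ins: a text representation that is a small perturbation of an image one.
image_rep = normalize([random.gauss(0, 1) for _ in range(DIM)])
text_rep = normalize([x + 0.05 * random.gauss(0, 1) for x in image_rep])

h_image = encode(image_rep, W, K)
h_text = encode(text_rep, W, K)

print("active for image:", sorted(active_neurons(h_image)))
print("active for text: ", sorted(active_neurons(h_text)))
print("shared neurons:  ", sorted(active_neurons(h_image) & active_neurons(h_text)))
print("reconstruction dim:", len(decode(h_image, W)))
```

In this sketch, each index in the "shared neurons" set plays the role of a concept that both modalities activate. The abstract states that training encourages semantically similar representations to activate neurons consistently; here nothing is trained, so any overlap only reflects the closeness of the two toy inputs.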

Code & Models

Models

shufanshen/VL-SAE (Hugging Face model)
