From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition

Francesco Gentile; Nicola Dall'Asen; Francesco Tonini; Massimiliano Mancini; Lorenzo Vaquero; Elisa Ricci

arXiv:2603.24653 · cs.CV · March 27, 2026

TL;DR

This paper introduces SITH, a data-free framework for interpreting CLIP's vision transformer by decomposing attention heads into semantically meaningful concepts, enabling faithful explanations and model edits without retraining.

Contribution

SITH is the first fully data-free, training-free method to interpret CLIP's weights via singular vector decomposition and concept-based explanations, facilitating precise model editing.

Findings

01. SITH provides coherent, faithful intra-head explanations.

02. Model fine-tuning reweights existing semantic bases rather than learning new features.

03. SITH enables concept-based model edits that improve downstream performance.

Abstract

As vision-language models are deployed at scale, understanding their internal mechanisms becomes increasingly critical. Existing interpretability methods predominantly rely on activations, making them dataset-dependent, vulnerable to data bias, and often restricted to coarse head-level explanations. We introduce SITH (Semantic Inspection of Transformer Heads), a fully data-free, training-free framework that directly analyzes CLIP's vision transformer in weight space. For each attention head, we decompose its value-output matrix into singular vectors and interpret each one via COMP (Coherent Orthogonal Matching Pursuit), a new algorithm that explains them as sparse, semantically coherent combinations of human-interpretable concepts. We show that SITH yields coherent, faithful intra-head explanations, validated through reconstruction fidelity and interpretability experiments. This allows…
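The two steps named in the abstract (decompose a head's value-output matrix into singular vectors, then explain each vector as a sparse combination of concepts) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain orthogonal matching pursuit as a stand-in for COMP, whose coherence mechanism is not described in this listing. The function names, the row-vector convention `x @ W_V @ W_O`, the random weights, and the random "concept" dictionary are all assumptions made for illustration; in practice the dictionary would presumably hold text embeddings of concept names in the same space as the singular vectors.

```python
import numpy as np


def head_singular_vectors(w_v, w_o, top_k):
    """Return the top-k singular triplets of a head's value-output matrix.

    Assumes the row-vector convention x @ W_V @ W_O, so the rows of `vt`
    are output-side directions (hypothetical convention, not from the paper).
    """
    w_vo = w_v @ w_o  # shape (d_model, d_model), rank at most d_head
    u, s, vt = np.linalg.svd(w_vo, full_matrices=False)
    return u[:, :top_k], s[:top_k], vt[:top_k]


def omp(target, dictionary, n_nonzero):
    """Plain orthogonal matching pursuit (a stand-in, NOT the paper's COMP).

    dictionary: (n_concepts, d) array with unit-norm rows.
    Returns the selected concept indices and their least-squares coefficients.
    """
    residual = target.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the unused concept most correlated with the current residual.
        scores = np.abs(dictionary @ residual)
        if support:
            scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        # Refit all selected concepts jointly, then update the residual.
        atoms = dictionary[support].T  # shape (d, len(support))
        coeffs, *_ = np.linalg.lstsq(atoms, target, rcond=None)
        residual = target - atoms @ coeffs
    return support, coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_model, d_head, n_concepts = 64, 8, 200

    # Random stand-ins for one head's weights and for concept embeddings.
    w_v = rng.normal(size=(d_model, d_head))
    w_o = rng.normal(size=(d_head, d_model))
    concepts = rng.normal(size=(n_concepts, d_model))
    concepts /= np.linalg.norm(concepts, axis=1, keepdims=True)

    _, s, vt = head_singular_vectors(w_v, w_o, top_k=3)
    for sigma, direction in zip(s, vt):
        idx, weights = omp(direction, concepts, n_nonzero=5)
        print(f"sigma={sigma:.2f} concepts={idx} weights={np.round(weights, 2)}")
```

Because no input images appear anywhere in the sketch, it reflects the data-free character of the approach: only weights and a concept vocabulary are needed. With random inputs the selected indices carry no meaning; the sketch only shows the shape of the computation.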

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis