Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples
Oussama Bouanani, Jim Berend, Wojciech Samek, Sebastian Lapuschkin, Maximilian Dreyer

TL;DR
This paper introduces Contrastive Semantic Projection (CSP), a method that uses contrastive examples with vision language models (VLMs) and CLIP-like encoders to produce more faithful and more specific neuron labels for deep networks.
Contribution
The paper presents CSP, an extension of SemanticLens that integrates contrastive examples into neuron labeling to improve faithfulness and semantic detail.
Findings
Providing contrastive image sets to VLMs yields candidate labels that are more specific and more faithful.
CSP improves faithfulness and semantic granularity over baselines.
Experiments and a melanoma detection case study demonstrate the method's effectiveness.
Abstract
Neuron labeling assigns textual descriptions to internal units of deep networks. Existing approaches typically rely on highly activating examples, often yielding broad or misleading labels by focusing on dominant but incidental visual factors. Prior work such as FALCON introduced contrastive examples -- inputs that are semantically similar to activating examples but elicit low activations -- to sharpen explanations, but it primarily addresses subspace-level interpretability rather than scalable neuron-level labeling. We revisit contrastive explanations for neuron-level labeling in two stages: (1) candidate label generation with vision language models (VLMs) and (2) label assignment with CLIP-like encoders. First, we show that providing contrastive image sets to VLMs yields candidate labels that are more specific and more faithful. Second, we introduce Contrastive Semantic Projection…
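To make the idea of contrastive examples concrete, here is a minimal sketch in plain Python. It is not the paper's CSP formulation, which the truncated abstract above does not spell out. It only illustrates the definition given there (inputs that are semantically similar to activating examples but elicit low activations) and one simple, assumed way such examples could change label assignment: scoring each label by its similarity to the activating set minus its similarity to the contrastive set. The three-dimensional toy vectors stand in for CLIP-like embeddings, and every name (`select_contrastive`, `contrastive_label_scores`, `low_quantile`, and so on) is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def select_contrastive(embeddings, activations, top_k=2, n_contrast=2, low_quantile=0.5):
    """Return (top-activating indices, contrastive indices).

    Contrastive inputs are those with low activation (at or below the
    given quantile) that are most similar to the top-activating inputs.
    This selection rule is an assumption made for illustration.
    """
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    top = order[:top_k]
    prototype = mean_vector([embeddings[i] for i in top])
    sorted_acts = sorted(activations)
    cutoff = sorted_acts[int(low_quantile * (len(sorted_acts) - 1))]
    candidates = [i for i in order[top_k:] if activations[i] <= cutoff]
    candidates.sort(key=lambda i: cosine(embeddings[i], prototype), reverse=True)
    return top, candidates[:n_contrast]

def contrastive_label_scores(label_embeddings, pos_embeddings, neg_embeddings):
    """Assumed scoring rule: similarity to activating set minus contrastive set."""
    pos = mean_vector(pos_embeddings)
    neg = mean_vector(neg_embeddings)
    return {name: cosine(vec, pos) - cosine(vec, neg)
            for name, vec in label_embeddings.items()}

# Toy embedding axes: [dog, grass, car]. The imagined neuron fires on dogs,
# but its top-activating images also contain a lot of grass.
embeddings = [[1.0, 1.0, 0.0], [0.9, 1.0, 0.0],   # dogs on grass (high activation)
              [0.0, 1.0, 0.0], [0.1, 1.0, 0.0],   # grass, almost no dog (low activation)
              [0.0, 0.0, 1.0]]                    # car (low activation, unrelated)
activations = [0.95, 0.90, 0.05, 0.10, 0.02]
labels = {"dog": [1.0, 0.0, 0.0], "grass": [0.0, 1.0, 0.0], "car": [0.0, 0.0, 1.0]}

top, contrast = select_contrastive(embeddings, activations)
pos = [embeddings[i] for i in top]
neg = [embeddings[i] for i in contrast]

# Baseline: similarity to the activating examples only.
plain_scores = {name: cosine(vec, mean_vector(pos)) for name, vec in labels.items()}
contrast_scores = contrastive_label_scores(labels, pos, neg)

best_plain = max(plain_scores, key=plain_scores.get)
best_contrastive = max(contrast_scores, key=contrast_scores.get)
print("activating only:", best_plain)
print("with contrastive examples:", best_contrastive)
```

On this toy data the activating-only score favors "grass", the dominant but incidental factor, while subtracting similarity to the grass-only contrastive images moves "dog" to the top. In practice the vectors would come from a CLIP-like image and text encoder rather than hand-written lists.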
