Beyond Semantics: Disentangling Information Scope in Sparse Autoencoders for CLIP
Yusung Ro, Jaehyun Choi, Junmo Kim

TL;DR
This paper introduces the concept of information scope in Sparse Autoencoders for CLIP, distinguishing features by their spatial aggregation level and proposing the Contextual Dependency Score to quantify this property.
Contribution
The paper presents information scope as a new dimension of interpretability, together with the Contextual Dependency Score (CDS), a metric used to analyze how features of different scopes influence CLIP's predictions.
Findings
Features with different scopes have distinct impacts on CLIP's outputs.
Some SAE features respond consistently under spatial perturbations, while others shift unpredictably with minor input changes.
Information scope is a key axis for understanding CLIP representations.
Abstract
Sparse Autoencoders (SAEs) have emerged as a powerful tool for interpreting the internal representations of CLIP vision encoders, yet existing analyses largely focus on the semantic meaning of individual features. We introduce information scope as a complementary dimension of interpretability that characterizes how broadly an SAE feature aggregates visual evidence, ranging from localized, patch-specific cues to global, image-level signals. We observe that some SAE features respond consistently across spatial perturbations, while others shift unpredictably with minor input changes, indicating a fundamental distinction in their underlying scope. To quantify this, we propose the Contextual Dependency Score (CDS), which separates positionally stable local scope features from positionally variant global scope features. Our experiments show that features of different information scopes exert…
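The abstract's distinction between features that respond consistently under spatial perturbations and those that do not can be made concrete with a toy sketch. The function below is not the paper's CDS, whose definition is not given in this summary; it is a hypothetical proxy that assumes a feature's activations are available as a 2D patch grid and that the perturbation is a circular shift of the image by whole patches. The name `shift_consistency` and its parameters are illustrative only.

```python
import numpy as np

def shift_consistency(feature_map, shifted_map, shift):
    """Hypothetical stability proxy (NOT the paper's CDS).

    feature_map: 2D array of one SAE feature's activations per patch
                 on the original image.
    shifted_map: the same feature's activations on the shifted image.
    shift:       (dy, dx) shift in patches, assumed circular (wrap-around).

    Returns the cosine similarity between the original map and the
    shifted map after undoing the shift. A value near 1 means the
    activation pattern followed the image content.
    """
    dy, dx = shift
    # Undo the shift so both maps are in the original image's coordinates.
    realigned = np.roll(shifted_map, (-dy, -dx), axis=(0, 1))
    a = feature_map.ravel()
    b = realigned.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy example: a feature that fires on a single patch.
local = np.zeros((4, 4))
local[1, 2] = 1.0

# Case 1: the activation moves with the content under a (1, 1) shift.
follows_content = np.roll(local, (1, 1), axis=(0, 1))

# Case 2: the activation lands on an unrelated patch after the shift.
unrelated = np.zeros((4, 4))
unrelated[0, 0] = 1.0

stable_score = shift_consistency(local, follows_content, (1, 1))
unstable_score = shift_consistency(local, unrelated, (1, 1))
```

In this toy setting, the first case scores high because the activation tracks the shifted content, and the second scores low because it does not. How such behaviour maps onto the paper's local scope and global scope categories depends on the actual CDS definition in the full text.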
