Make it SING: Analyzing Semantic Invariants in Classifiers
Harel Yadid, Meir Yossef Levi, Roy Betser, Guy Gilboa

TL;DR
This paper introduces SING, a method for interpreting the semantic content of invariants in classifiers by mapping null-space variations to human-understandable descriptions, revealing differences in how models preserve class semantics.
Contribution
SING provides a novel approach to interpret and visualize the semantic invariants in classifiers using multi-modal models, bridging the gap between geometric invariants and human-understandable semantics.
Findings
ResNet50 leaks relevant semantic attributes to null space
DinoViT better maintains class semantics across invariants
SING can analyze local and global invariants in classifiers
Abstract
All classifiers, including state-of-the-art vision models, possess invariants, partially rooted in the geometry of their linear mappings. These invariants, which reside in the null-space of the classifier, induce equivalent sets of inputs that map to identical outputs. The semantic content of these invariants remains vague, as existing approaches struggle to provide human-interpretable information. To address this gap, we present Semantic Interpretation of the Null-space Geometry (SING), a method that constructs equivalent images, with respect to the network, and assigns semantic interpretations to the available variations. We use a mapping from network features to multi-modal vision language models. This allows us to obtain natural language descriptions and visual examples of the induced semantic shifts. SING can be applied to a single image, uncovering local invariants, or to sets of…
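The invariance described in the abstract can be illustrated on a toy linear head. The following is a minimal sketch, not the authors' implementation: the weight matrix `W`, bias `b`, the dimensions, and the feature vector `f` are all hypothetical random stand-ins, and the sketch works only at the feature level (it does not construct equivalent images or map to a vision-language model). It shows that moving a feature along the null-space of `W` leaves the logits unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear classifier head: 10 classes, 64-dimensional features.
num_classes, feat_dim = 10, 64
W = rng.standard_normal((num_classes, feat_dim))
b = rng.standard_normal(num_classes)

# Full SVD of W: the rows of Vt beyond rank(W) span the null-space of W.
U, S, Vt = np.linalg.svd(W, full_matrices=True)
rank = int(np.sum(S > 1e-10 * S.max()))
null_basis = Vt[rank:]                 # shape: (feat_dim - rank, feat_dim)

# Orthogonal projectors onto the null-space and its complement (row space).
P_null = null_basis.T @ null_basis
P_row = np.eye(feat_dim) - P_null


def logits(feature):
    """Logits of the toy linear head for a single feature vector."""
    return W @ feature + b


# Split a (random, illustrative) feature into its two components.
f = rng.standard_normal(feat_dim)
f_row, f_null = P_row @ f, P_null @ f

# Perturb the feature along a random null-space direction only.
delta = P_null @ rng.standard_normal(feat_dim)
f_equiv = f + 5.0 * delta

# The two features differ, yet the head assigns them the same logits.
max_logit_gap = float(np.max(np.abs(logits(f) - logits(f_equiv))))
print("feature distance:", np.linalg.norm(f_equiv - f))
print("max logit gap:   ", max_logit_gap)
```

In this toy setting, `f` and `f_equiv` belong to the same equivalence set with respect to the head. Whether the null-space component `f_null` carries class-relevant semantics is the question the paper addresses, and this sketch does not answer it.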
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
The method can be applied to a single image to uncover local invariants, as well as to sets of images for statistical analysis. The method reveals important insights about different model architectures. For example, the authors show that ResNet50 leaks relevant semantic attributes to the null space, while DINO-ViT is better at maintaining class semantics across the invariant space. The paper demonstrates how this approach can help understand the robustness of different neural network architectures, diagnose
I am fascinated by the fact that Dino shows better results than other architectures. Why is that the case? What does that tell us about Dino? What does that tell us about SING as a framework? Unfortunately, the authors do not provide any type of explanation as to why that is the case. While it is helpful to have SING as a framework, if we do not understand the reason behind its results, it is difficult to truly understand and trust it. The paper assumes a linear approximation. I wonder why thi
- **Originality:** The proposed decomposition of classifier weights into principal and null components is an original perspective for interpretability. This approach can be useful to analyze the functional aspects of learned representations.
- **Clarity of method:** The method overview figure is very helpful to understand the overall flow of the proposed framework.
- **Clarity of Experiments and Findings:** The description of the experimental setup and the interpretation of the results lack clarity. It is difficult to understand what specific questions each experiment aims to answer and what conclusions should be drawn from the presented findings. More detailed explanations and clearer connections between experimental results and their implications would greatly improve this section.
- **Quality:** The paper does not directly verify that the decomposition
- The rationale is well defined and supported by previous works and some analysis of existing methods.
- SING provides an interesting and valuable direction of research into model evaluation, assessment and interpretation that, while simple, could be a nice contribution to the field.
- The approach of studying the null space to analyze semantic meaning is a particularly novel and interesting direction that in this case seemingly provides unique and meaningful analysis of learnt model embeddin
**Major:**
- From my understanding, the proposed method relies on the correctness of the CLIP encoder, meaning that CLIP is assumed to structure the embedding in a semantically meaningful way. If CLIP is insufficient, the method itself may perform poorly. Some analysis of this issue would be a good study of robustness and generalisation to other vision-language models at different performance levels.
- The rationale and analysis behind the metrics could be elaborated
- The analysis is sound in theory, and the presentation is very clear and interesting.
- It is very nice to see the experiment on perturbed images.
- The results are solid and the experiments are interestingly designed.
- I wonder how this method differs from other concept-based methods such as TCAV [1], and from concept bottleneck methodologies [2]. I believe a similar result or analysis could be obtained by applying those methods.
- The work should have included more base ViTs, such as DeiT [3] or CaiT [4], that are not trained within the vision-language space. The authors should also consider Swin Transformer [5].
- It would be interesting to extend the analysis to diffusion models.
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Topological and Geometric Data Analysis
