On the Properties of Feature Attribution for Supervised Contrastive Learning
Leonardo Arrighi, Julia Eva Belloni, Aurélie Gallet, Ivan Gentile, Matteo Lippi, Marco Zullich

TL;DR
This paper empirically shows that neural networks trained with supervised contrastive learning (SCL) yield higher-quality feature attribution explanations than networks trained with Cross-Entropy (CE), which supports model transparency and trustworthiness.
Contribution
It provides an empirical comparison showing that SCL improves feature attribution quality over Cross-Entropy (CE) training in image classification tasks.
Findings
SCL-trained models have more faithful feature attributions.
SCL models exhibit lower complexity in explanations.
SCL enhances the continuity of feature attributions.
Abstract
Most Neural Networks (NNs) for classification are trained using Cross-Entropy (CE) as a loss function. This approach requires the model to have an explicit classification layer. However, alternative approaches exist, such as Contrastive Learning (CL). Instead of performing classification explicitly, CL has the NN produce an embedding space where projections of similar data are pulled together, while projections of dissimilar data are pushed apart. In Supervised CL (SCL), labels serve as the similarity criterion, creating an embedding space where the projected data points are well-clustered. SCL provides crucial advantages over CE with regard to adversarial robustness and out-of-distribution detection, making it a more natural choice in safety-critical scenarios. In the present paper, we empirically show that NNs for image classification trained with SCL present…
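The pull-together, push-apart idea described in the abstract can be illustrated with a small sketch. The excerpt does not state which loss the authors use, so the code below assumes one common supervised contrastive formulation (a temperature-scaled softmax over cosine similarities, averaged over same-label positives). The function name `supcon_loss`, its parameters, and the default temperature are hypothetical choices made for illustration only.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    Assumption: this follows a common SupCon-style formulation, not
    necessarily the exact loss used in the paper.
    """
    z = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # L2-normalize so dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(y)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples in the batch that share the anchor's label.
    positives = (y[:, None] == y[None, :]) & not_self
    # Log-softmax over all non-self samples, shifted for numerical stability.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        # No anchor has a same-label partner in this batch.
        return 0.0
    # Mean log-probability of the positives, for anchors that have any.
    pos_log_prob = (log_prob * positives).sum(axis=1)[has_pos] / n_pos[has_pos]
    return float(-pos_log_prob.mean())

# Two tight groups of 2-D embeddings.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(points, [0, 0, 1, 1])     # labels agree with the groups
loss_mismatched = supcon_loss(points, [0, 1, 0, 1])   # labels cut across the groups
```

When labels agree with the geometric groups, each anchor's positives are already its nearest neighbours and the loss is close to zero. When labels cut across the groups, the loss is much larger, which is the signal that would drive same-label projections together during training.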
