Beyond Accuracy: Uncovering the Role of Similarity Perception and its Alignment with Semantics in Supervised Learning
Katarzyna Filus, Mateusz Żarski

TL;DR
This paper introduces Deep Similarity Inspector (DSI), a framework to analyze how deep vision models develop and align their similarity perception with semantic understanding during training.
Contribution
It systematically investigates the emergence and evolution of similarity perception in CNNs and ViTs, revealing distinct development phases and refinement phenomena.
Findings
Both CNNs and ViTs develop rich similarity perception during training.
Distinct phases: initial surge, refinement, stabilization.
Refinement phenomenon observed in mistake correction.
Abstract
Similarity manifests in various forms, including semantic similarity, which is particularly important because it approximates human object categorization based on, e.g., shared functionalities and evolutionary traits. It also offers practical advantages in computational modeling via lexical structures such as WordNet, which provide constant and interpretable similarity. In the domain of deep vision, however, the emergence of similarity perception still receives too little attention. We introduce Deep Similarity Inspector (DSI), a systematic framework to inspect how deep vision networks develop their similarity perception and its alignment with semantic similarity. Our experiments show that both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) develop a rich similarity perception during training with 3 phases (initial similarity surge, refinement,…
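To make "alignment with semantic similarity" concrete, here is a minimal sketch of one way such an alignment could be scored. It is not the DSI procedure, which this summary does not describe. It assumes that each class has a representative feature vector, that cosine similarity between those vectors stands in for the model's similarity perception, and that Spearman rank correlation is a reasonable alignment score. The class names, vectors, and semantic values are hypothetical toy data, and the hand-written `semantic_sim` table only stands in for similarities that would be derived from a lexical structure such as WordNet.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def alignment(class_vectors, semantic_sim):
    """Rank-correlate model similarity with semantic similarity over class pairs."""
    names = sorted(class_vectors)
    pairs = list(combinations(names, 2))
    model = [cosine(class_vectors[a], class_vectors[b]) for a, b in pairs]
    semantic = [semantic_sim[(a, b)] for a, b in pairs]
    return spearman(model, semantic)


# Hypothetical toy data: per-class feature vectors from some checkpoint.
class_vectors = {
    "cat": [1.0, 0.9, 0.1],
    "dog": [0.9, 1.0, 0.2],
    "truck": [0.1, 0.2, 1.0],
}
# Hypothetical semantic similarities (stand-in for WordNet-derived values).
semantic_sim = {
    ("cat", "dog"): 0.8,
    ("cat", "truck"): 0.1,
    ("dog", "truck"): 0.15,
}

score = alignment(class_vectors, semantic_sim)
print(score)
```

Computing such a score at successive training checkpoints would give a curve over training time. Whether that curve shows the phases reported above depends on the actual measures used in the paper.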
Taxonomy
Topics: Cognitive Science and Education Research
