Towards Visually Explaining Similarity Models
Meng Zheng, Srikrishna Karanam, Terrence Chen, Richard J. Radke, Ziyan Wu

TL;DR
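This paper introduces a gradient-based visual explanation method for similarity models, which have no classification module for existing attention methods to rely on. The resulting attention maps make similarity predictions interpretable and can be used to improve performance across several image tasks.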
Contribution
The work presents a novel approach for generating visual explanations of similarity models using only the learned feature embedding, which makes it applicable to any CNN-based similarity architecture.
Findings
Attention maps enhance interpretability of similarity models.
Incorporating explanations improves model performance.
The method is effective across multiple image tasks.
Abstract
We consider the problem of visually explaining similarity models, i.e., explaining why a model predicts two images to be similar in addition to producing a scalar score. While much recent work in visual model interpretability has focused on gradient-based attention, these methods rely on a classification module to generate visual explanations. Consequently, they cannot readily explain other kinds of models that do not use or need classification-like loss functions (e.g., similarity models trained with a metric learning loss). In this work, we bridge this crucial gap, presenting a method to generate gradient-based visual attention for image similarity predictors. By relying solely on the learned feature embedding, we show that our approach can be applied to any kind of CNN-based similarity architecture, an important step towards generic visual explainability. We show that our resulting…
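The idea of deriving attention from the embedding alone can be illustrated with a small NumPy sketch. It is not the authors' formulation, which the truncated abstract does not spell out; it is a Grad-CAM-style stand-in under stated assumptions: the embedding is taken to be the global average pool of the last convolutional feature maps, the similarity score is cosine similarity, and the function names `cosine_similarity_and_grad` and `similarity_attention` are hypothetical. The gradient of the score with respect to the feature maps is computed analytically, so no autodiff library is needed.

```python
import numpy as np


def cosine_similarity_and_grad(f1, f2):
    """Return cosine similarity s(f1, f2) and its gradient with respect to f1."""
    n1, n2 = np.linalg.norm(f1), np.linalg.norm(f2)
    s = float(f1 @ f2 / (n1 * n2))
    # d/df1 [ f1.f2 / (|f1||f2|) ] = f2 / (|f1||f2|) - s * f1 / |f1|^2
    grad_f1 = f2 / (n1 * n2) - s * f1 / (n1 ** 2)
    return s, grad_f1


def similarity_attention(maps_a, maps_b):
    """Attention map over image A explaining its similarity to image B.

    maps_a, maps_b: arrays of shape (K, H, W), assumed to be the last
    convolutional feature maps of a CNN (illustrative assumption).
    """
    _, h, w = maps_a.shape
    # Assumed embedding: global average pooling over the spatial dimensions.
    f_a = maps_a.mean(axis=(1, 2))
    f_b = maps_b.mean(axis=(1, 2))
    s, grad_fa = cosine_similarity_and_grad(f_a, f_b)
    # Chain rule through average pooling: ds/dmaps_a[k, i, j] = grad_fa[k] / (H * W).
    # Averaging that constant over space gives the Grad-CAM-style channel weight.
    alpha = grad_fa / (h * w)
    # Weighted sum of feature maps, keeping only positive evidence (ReLU).
    cam = np.maximum((alpha[:, None, None] * maps_a).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()  # scale to [0, 1] for display
    return s, cam


if __name__ == "__main__":
    # Toy input: image A has two channels firing in different corners,
    # image B only activates channel 0.
    maps_a = np.zeros((2, 2, 2))
    maps_a[0, 0, 0] = 4.0
    maps_a[1, 1, 1] = 4.0
    maps_b = np.zeros((2, 2, 2))
    maps_b[0] = 1.0
    score, cam = similarity_attention(maps_a, maps_b)
    print(score)
    print(cam)
```

In the toy input, the attention map highlights only the corner of image A where the channel shared with image B fires, since the other channel lowers the cosine similarity and is suppressed by the ReLU. With a real network, `maps_a` and `maps_b` would come from a forward pass, and the map would be upsampled to the input resolution before overlaying it on the image.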
Taxonomy
Topics: Data Visualization and Analytics · Anomaly Detection Techniques and Applications · Video Analysis and Summarization
Methods: Interpretability
