Exploiting the Relationship Between Kendall's Rank Correlation and Cosine Similarity for Attribution Protection
Fan Wang, Adams Wai-Kin Kong

TL;DR
This paper shows that the expected Kendall's rank correlation between attribution vectors is positively correlated with their cosine similarity, and proposes a regularizer that makes neural network attributions more robust to input perturbations.
Contribution
The paper introduces the integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions to improve attribution robustness.
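To make the idea of maximizing cosine similarity concrete, here is a minimal sketch of a direction-based penalty between two attribution vectors. The form `1 - cos` and the names `cosine_similarity` and `cosine_penalty` are assumptions for illustration; this summary does not state the exact loss used in the paper, and computing the integrated-gradients attributions themselves is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector has no direction.
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_penalty(natural_attr, perturbed_attr):
    """Hypothetical regularization term: small when the directions agree.

    Minimizing this value is equivalent to maximizing cosine similarity.
    """
    return 1.0 - cosine_similarity(natural_attr, perturbed_attr)


if __name__ == "__main__":
    natural = [0.2, -0.5, 1.0]
    rescaled = [2 * v for v in natural]   # same direction, larger norm
    rotated = [1.0, 0.2, -0.5]            # same norm, different direction
    print(cosine_penalty(natural, rescaled))
    print(cosine_penalty(natural, rotated))
```

A penalty of this kind ignores rescaling of the attribution vector and reacts only to changes in its direction, which is the property the summary identifies as the key to attribution robustness.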
Findings
The expected Kendall's rank correlation is positively correlated with cosine similarity.
IGR encourages neurons to have consistent activation states, which enhances robustness.
Experiments show improved protection against attribution attacks across models and datasets.
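The first finding can be explored numerically with a small simulation. The sketch below is not the paper's analysis: it assumes Gaussian "attributions" with additive Gaussian noise, uses the tie-free tau-a form of Kendall's rank correlation, and the helper `compare` and its parameters are hypothetical. It reports the average of both measures at several noise levels so the trend can be inspected.

```python
import math
import random


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (a[i] - a[j]) * (b[i] - b[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare(noise_scale, dim=50, trials=20, seed=0):
    """Average tau and cosine similarity between a vector and a noisy copy."""
    rng = random.Random(seed)
    taus, cosines = [], []
    for _ in range(trials):
        base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        noisy = [x + rng.gauss(0.0, noise_scale) for x in base]
        taus.append(kendall_tau(base, noisy))
        cosines.append(cosine_similarity(base, noisy))
    return sum(taus) / trials, sum(cosines) / trials


if __name__ == "__main__":
    for scale in (0.1, 0.5, 1.0, 2.0):
        tau, cos = compare(scale)
        print(f"noise={scale:.1f}  tau={tau:.3f}  cosine={cos:.3f}")
```

Because Kendall's rank correlation is non-differentiable, a differentiable quantity that moves with it, such as cosine similarity, is a natural candidate for a training objective.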
Abstract
Model attributions are important in deep neural networks as they aid practitioners in understanding the models, but recent studies reveal that attributions can be easily perturbed by adding imperceptible noise to the input. The non-differentiable Kendall's rank correlation is a key performance index for attribution protection. In this paper, we first show that the expected Kendall's rank correlation is positively correlated with cosine similarity and then indicate that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using norm and propose the integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. Our analysis further exposes that IGR encourages neurons with the same…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
