Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency

Md Abdul Kadir; Gowtham Krishna Addluri; Daniel Sonntag

arXiv:2307.02150·cs.LG·September 20, 2023

Open Access

TL;DR

This paper investigates how to harmonize feature attribution methods across different deep learning architectures like CNNs and transformers to improve interpretability and consistency of explanations in machine learning models.

Contribution

It introduces a method for harmonizing feature attributions across diverse architectures, enhancing the reliability of local explanations in deep learning models.

Findings

01 Harmonized attributions improve interpretability across architectures

02 Method increases consistency of feature importance explanations

03 Enhances trust in model predictions regardless of architecture

Abstract

Ensuring the trustworthiness and interpretability of machine learning models is critical to their deployment in real-world applications. Feature attribution methods, which provide local explanations of model predictions by attributing importance to individual input features, have gained significant attention. This study examines how well feature attributions generalize across deep learning architectures such as convolutional neural networks (CNNs) and vision transformers. We aim to assess the feasibility of using a feature attribution method as a feature detector and to examine how these features can be harmonized across multiple models that employ distinct architectures but are trained on the same data distribution. By exploring this harmonization, we aim to develop a more coherent and optimistic understanding of feature attributions, enhancing the consistency of local explanations…
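To make the idea of comparing attributions across architectures concrete, here is a minimal sketch in plain NumPy. It is not the paper's method: gradient × input is used as a stand-in attribution technique, the two toy models (a logistic model and a small tanh network) are hypothetical stand-ins for a CNN and a vision transformer, and cosine similarity is an assumed agreement measure. All function names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linear_model(w, b):
    """Toy 'architecture A': logistic model p = sigmoid(w.x + b)."""
    def predict(x):
        return sigmoid(x @ w + b)
    def grad(x):
        p = predict(x)
        return p * (1.0 - p) * w  # dp/dx by the chain rule
    return predict, grad

def mlp_model(W1, b1, w2, b2):
    """Toy 'architecture B': one hidden tanh layer, sigmoid output."""
    def predict(x):
        return sigmoid(np.tanh(x @ W1 + b1) @ w2 + b2)
    def grad(x):
        h = np.tanh(x @ W1 + b1)
        p = sigmoid(h @ w2 + b2)
        # dp/dx = p(1-p) * W1 @ ((1 - h^2) * w2)
        return p * (1.0 - p) * (W1 @ ((1.0 - h**2) * w2))
    return predict, grad

def gradient_x_input(grad, x):
    """Local attribution: elementwise product of gradient and input."""
    return grad(x) * x

def cosine_similarity(a, b):
    """Agreement between two attribution vectors, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Hypothetical setup: random weights stand in for trained models.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
predict_a, grad_a = linear_model(rng.normal(size=8), 0.1)
predict_b, grad_b = mlp_model(rng.normal(size=(8, 4)), np.zeros(4),
                              rng.normal(size=4), 0.0)

attr_a = gradient_x_input(grad_a, x)
attr_b = gradient_x_input(grad_b, x)
agreement = cosine_similarity(attr_a, attr_b)
print("attribution agreement:", agreement)
```

With untrained random weights the agreement score carries no meaning; the sketch only shows the mechanics of computing one attribution vector per model for the same input and scoring how closely they align.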

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning in Materials Science