Bridging Interpretability and Robustness Using LIME-Guided Model Refinement

Navid Nayyem; Abdullah Rakin; Longwei Wang

arXiv:2412.18952·cs.LG·December 30, 2024

TL;DR

This paper introduces a LIME-guided refinement framework that improves both the interpretability and the robustness of deep learning models by penalizing reliance on irrelevant or misleading features during training. The claims are supported by empirical evaluations on multiple benchmark datasets.

Contribution

The paper presents an approach that uses LIME explanations to identify irrelevant or misleading features and then iteratively reduces the model's reliance on them, with the aim of improving both robustness and interpretability.

Findings

01 Enhanced resistance to adversarial attacks

02 Improved model interpretability

03 Better generalization to out-of-distribution data

Abstract

This paper explores the intricate relationship between interpretability and robustness in deep learning models. Despite their remarkable performance across various tasks, deep learning models often exhibit critical vulnerabilities, including susceptibility to adversarial attacks, over-reliance on spurious correlations, and a lack of transparency in their decision-making processes. To address these limitations, we propose a novel framework that leverages Local Interpretable Model-Agnostic Explanations (LIME) to systematically enhance model robustness. By identifying and mitigating the influence of irrelevant or misleading features, our approach iteratively refines the model, penalizing reliance on these features during training. Empirical evaluations on multiple benchmark datasets demonstrate that LIME-guided refinement not only improves interpretability but also significantly enhances…
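The refinement loop described in the abstract (explain, flag misleading features, penalize reliance on them, retrain) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy logistic model on synthetic two-feature data, a hand-rolled LIME-style local surrogate instead of the `lime` package, an L2 penalty on the flagged weights as the form of the penalty, and prior knowledge that feature 1 is spurious. All function names, thresholds, and the penalty schedule are hypothetical.

```python
import numpy as np

def predict_proba(w, b, X):
    """Logistic model: probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def lime_style_coefs(w, b, x, rng, n_samples=500, scale=0.5, kernel_width=1.0):
    """Simplified LIME for tabular inputs: fit a proximity-weighted linear
    surrogate to the model's outputs on Gaussian perturbations of x."""
    Z = x + scale * rng.standard_normal((n_samples, x.size))
    target = predict_proba(w, b, Z)
    dist = np.linalg.norm(Z - x, axis=1)
    sqrt_pi = np.sqrt(np.exp(-(dist ** 2) / kernel_width ** 2))
    A = np.hstack([Z - x, np.ones((n_samples, 1))]) * sqrt_pi[:, None]
    coef, *_ = np.linalg.lstsq(A, target * sqrt_pi, rcond=None)
    return coef[:-1]  # drop the intercept

def attribution_share(w, b, X_explain, rng):
    """Share of total absolute surrogate weight carried by each feature."""
    total = np.zeros(X_explain.shape[1])
    for x in X_explain:
        total += np.abs(lime_style_coefs(w, b, x, rng))
    return total / total.sum()

def train(X, y, mask, lam, lr=0.2, epochs=500):
    """Gradient descent on log loss + lam * sum(mask * w**2)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = predict_proba(w, b, X)
        w -= lr * (X.T @ (p - y) / len(y) + 2.0 * lam * mask * w)
        b -= lr * np.mean(p - y)
    return w, b

def refine(X, y, irrelevant, threshold=0.25, max_rounds=6, seed=0):
    """Retrain with a growing penalty until the flagged features' share of
    the LIME-style attributions drops below the (assumed) threshold."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(X.shape[1])
    mask[irrelevant] = 1.0
    lam, history = 0.0, []
    for _ in range(max_rounds):
        w, b = train(X, y, mask, lam)
        share = attribution_share(w, b, X[:20], rng)[irrelevant].sum()
        history.append((lam, share, w))
        if share <= threshold:
            break
        lam = 0.1 if lam == 0.0 else 2.0 * lam
    return history

def make_data(n=400, seed=1):
    """Synthetic data: feature 0 is a noisy signal, feature 1 a cleaner
    but (by assumption) spurious correlate of the label."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n).astype(float)
    s = 2.0 * y - 1.0
    x_signal = s + 1.0 * rng.standard_normal(n)
    x_spurious = s + 0.3 * rng.standard_normal(n)
    return np.column_stack([x_signal, x_spurious]), y

X, y = make_data()
history = refine(X, y, irrelevant=[1])
for lam, share, w in history:
    print(f"lam={lam:.1f}  spurious share={share:.2f}  weights={np.round(w, 2)}")
```

Each printed row is one refinement round: the penalty strength, the fraction of attribution mass on the flagged feature, and the learned weights. In a real setting the flagged set would come from inspecting LIME explanations rather than being fixed in advance, and the penalty would apply to a deep network rather than to linear weights.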

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning in Healthcare