Towards Interrogating Discriminative Machine Learning Models
Wenbo Guo, Kaixuan Zhang, Lin Lin, Sui Huang, Xinyu Xing

TL;DR
This paper augments a Bayesian regression mixture model with multiple elastic nets to explain a target machine learning model through global approximation. The approach supports both explanations of individual decisions and the discovery of model vulnerabilities, and is evaluated on text mining and image recognition tasks.
Contribution
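It proposes an explanation method that approximates a target model globally, so that users can inspect the model as a complete entity and discover its vulnerabilities. The abstract reports that it outperforms the state-of-the-art technique in explaining individual decisions.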
Findings
Outperforms state-of-the-art in explaining individual decisions
Enables discovery of model vulnerabilities
Effective across text and image tasks
Abstract
It is often impossible to understand how machine learning models reach a decision. While recent research has proposed various technical approaches that provide some clues as to how a learning model makes individual decisions, these approaches cannot provide users with the ability to inspect a learning model as a complete entity. In this work, we propose a new technical approach that augments a Bayesian regression mixture model with multiple elastic nets. Using the enhanced mixture model, we extract explanations for a target model through global approximation. To demonstrate the utility of our approach, we evaluate it on different learning models covering the tasks of text mining and image recognition. Our results indicate that the proposed approach not only outperforms the state-of-the-art technique in explaining individual decisions but also provides users with an ability to discover the…
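To make the idea of approximating a target model with a mixture of elastic-net regressions more concrete, here is a minimal sketch. It is not the authors' method: it replaces Bayesian inference over the mixture with a simple hard-assignment loop and a fixed number of components, and every name in it (`black_box`, `fit_elastic_net`, `fit_mixture`, the penalty values) is a hypothetical choice made for illustration. The toy target model is an assumption as well.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the elastic net."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def predict(w, b, x):
    """Linear prediction b + w.x for one sample."""
    return b + sum(wk * xk for wk, xk in zip(w, x))

def fit_elastic_net(X, y, alpha=0.01, l1_ratio=0.5, n_iter=50):
    """Coordinate descent on (1/2n)*||y - Xw - b||^2
    + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||_2^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, sum(y) / n
    for _ in range(n_iter):
        for j in range(d):
            rho = z = 0.0
            for xi, yi in zip(X, y):
                partial = predict(w, b, xi) - w[j] * xi[j]  # prediction without feature j
                rho += xi[j] * (yi - partial)
                z += xi[j] ** 2
            w[j] = soft_threshold(rho / n, alpha * l1_ratio) / (z / n + alpha * (1 - l1_ratio))
        b = sum(yi - predict(w, 0.0, xi) for xi, yi in zip(X, y)) / n
    return w, b

def fit_mixture(X, y, k=2, n_rounds=10, seed=0):
    """Hard-assignment stand-in for a regression mixture (assumption: fixed k).
    Alternates between fitting one elastic net per component and reassigning
    each sample to the component that predicts it best."""
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in X]
    comps = []
    for _ in range(n_rounds):
        comps = []
        for c in range(k):
            idx = [i for i, a in enumerate(assign) if a == c]
            if not idx:  # re-seed an empty component with one random sample
                idx = [rng.randrange(len(X))]
            comps.append(fit_elastic_net([X[i] for i in idx], [y[i] for i in idx]))
        assign = [min(range(k), key=lambda c: (predict(*comps[c], xi) - yi) ** 2)
                  for xi, yi in zip(X, y)]
    return comps, assign

def black_box(x):
    """Hypothetical target model: piecewise linear with two regimes."""
    return 3.0 * x[0] if x[0] > 0 else -2.0 * x[1]

data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [black_box(x) for x in X]

components, assignment = fit_mixture(X, y)
mix_mse = sum((predict(*components[a], xi) - yi) ** 2
              for xi, yi, a in zip(X, y, assignment)) / len(X)
```

In this sketch each entry of `components` is a `(weights, intercept)` pair. The weights of the component a sample is assigned to can be read as a sparse, local explanation of the target model's output for that sample, while the full set of components serves as a rough global approximation.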
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
