Adversarial Infidelity Learning for Model Interpretation

Jian Liang; Bing Bai; Yuren Cao; Kun Bai; Fei Wang

arXiv:2006.05379·stat.ML·August 4, 2020

TL;DR

This paper introduces MEED (Model-agnostic Effective Efficient Direct), a model-agnostic framework for instance-wise feature selection (IFS) in model interpretation. It uses Adversarial Infidelity Learning (AIL) and integrates prior interpretation methods, and it is validated through quantitative and human evaluations.

Contribution

The paper proposes the MEED framework with an Adversarial Infidelity Learning (AIL) mechanism for more accurate, efficient, and robust model interpretation, mitigating concerns about sanity, combinatorial shortcuts, model identifiability, and information transmission.

Findings

01  AIL enhances feature selection accuracy.

02  MEED outperforms existing interpretation methods.

03  Framework is validated by quantitative and human evaluations.

Abstract

Model interpretation is essential in data mining and knowledge discovery. It can help understand the intrinsic model working mechanism and check if the model has undesired characteristics. A popular way of performing model interpretation is Instance-wise Feature Selection (IFS), which provides an importance score of each feature representing the data samples to explain how the model generates the specific output. In this paper, we propose a Model-agnostic Effective Efficient Direct (MEED) IFS framework for model interpretation, mitigating concerns about sanity, combinatorial shortcuts, model identifiability, and information transmission. Also, we focus on the following setting: using selected features to directly predict the output of the given model, which serves as a primary evaluation metric for model-interpretation methods. Apart from the features, we involve the output of the given…
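To make the evaluation setting in the abstract concrete, the following is a minimal sketch of instance-wise feature selection and a fidelity check, not the MEED implementation. Everything in it is an assumption for illustration: the toy logistic `black_box`, the weight vector `w`, the stand-in importance scores `|x * w|`, and the choice to reuse the black-box model on the masked input instead of training a separate approximator as the paper's "direct" setting describes.

```python
import numpy as np

def black_box(x, w):
    """Toy stand-in for the model being explained: a logistic classifier."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def top_k_mask(scores, k):
    """Return a 0/1 mask keeping the k highest-scoring features of each row."""
    idx = np.argsort(-scores, axis=1)[:, :k]
    mask = np.zeros_like(scores)
    np.put_along_axis(mask, idx, 1.0, axis=1)
    return mask

def fidelity(x, w, scores, k):
    """Fraction of instances whose predicted label is unchanged
    when only the k selected features are kept (others set to zero)."""
    full = black_box(x, w) >= 0.5
    kept = black_box(x * top_k_mask(scores, k), w) >= 0.5
    return float(np.mean(full == kept))

rng = np.random.default_rng(0)
w = np.array([3.0, -2.0, 0.0, 0.0, 0.5])   # hypothetical model weights
x = rng.normal(size=(200, 5))              # hypothetical data samples

scores = np.abs(x * w)                     # hypothetical per-instance importance scores
fid_informed = fidelity(x, w, scores, k=2)
fid_random = fidelity(x, w, rng.random(x.shape), k=2)
print(f"fidelity with informed scores: {fid_informed:.3f}")
print(f"fidelity with random scores:   {fid_random:.3f}")
```

Comparing the two printed values gives a rough sense of how much a set of importance scores helps reproduce the model's output from only a few features per instance, which is the kind of quantity the abstract names as a primary evaluation metric.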

Code & Models

Repositories

langlrsw/MEED (TensorFlow, official)

Taxonomy

Methods: Feature Selection · Generative Adversarial Imitation Learning