Data Science Principles for Interpretable and Explainable AI

Kris Sankaran

arXiv:2405.10552 · stat.ML · August 20, 2024 · 1 citation

TL;DR

This paper reviews key principles and techniques for making AI models more interpretable and explainable, emphasizing transparency, user control, and evaluation criteria to address deployment risks.

Contribution

It synthesizes interpretability concepts, connects them to classical principles, illustrates basic techniques, and discusses evaluation and open challenges in explainable AI.

Findings

01. Explainability techniques like embeddings and integrated gradients are effective.

02. Audience goals are crucial in designing interpretability methods.

03. Evaluation criteria help assess interpretability approaches.

Abstract

Society's capacity for algorithmic problem-solving has never been greater. Artificial Intelligence is now applied across more domains than ever, a consequence of powerful abstractions, abundant data, and accessible software. As capabilities have expanded, so have risks, with models often deployed without fully understanding their potential impacts. Interpretable and interactive machine learning aims to make complex models more transparent and controllable, enhancing user agency. This review synthesizes key principles from the growing literature in this field. We first introduce precise vocabulary for discussing interpretability, like the distinction between glass box and explainable models. We then explore connections to classical statistical and design principles, like parsimony and the gulfs of interaction. Basic explainability techniques -- including learned embeddings, integrated…

Code & Models

Repositories

krisrs1128/interpretability_review
PyTorch · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)