Distilling Interpretable Models into Human-Readable Code

Walker Ravina; Ethan Sterling; Olexiy Oryeshko; Nathan Bell; Honglei Zhuang; Xuanhui Wang; Yonghui Wu; Alexander Grushetsky

arXiv:2101.08393·cs.LG·February 10, 2021

TL;DR

This paper introduces a method for distilling interpretable models into concise, human-readable code by approximating their univariate numerical functions with piecewise-linear curves, so that the result can be reviewed and edited by hand.

Contribution

The paper proposes a distillation approach that approximates a model's univariate numerical functions with concise piecewise-linear curves, making the model easier to review and modify manually.

Findings

1. Effective across classification, regression, and ranking tasks.
2. Produces accurate, concise, and human-readable models.
3. Demonstrates broad applicability with four datasets.

Abstract

The goal of model distillation is to faithfully transfer teacher model knowledge to a model which is faster, more generalizable, more interpretable, or possesses other desirable characteristics. Human-readability is an important and desirable standard for machine-learned model interpretability. Readable models are transparent and can be reviewed, manipulated, and deployed like traditional source code. As a result, such models can be improved outside the context of machine learning and manually edited if desired. Given that directly training such models is difficult, we propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code. The proposed distillation methodology approximates a model's univariate numerical functions with piecewise-linear curves in a localized manner. The resulting curve model representations are…
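As a rough illustration of what approximating a univariate numerical function with a piecewise-linear curve can look like, the Python sketch below fits the curve's heights at fixed knots by least squares and prints the result as a one-line expression. It is not the paper's algorithm and does not use the google/pwlfit API: the function names (`hat_basis`, `fit_pwl`, `to_code`), the quantile-based knot placement, and the `pwl([...])` output notation are all assumptions made for this example.

```python
import numpy as np


def hat_basis(xs, knots):
    """Design matrix whose column j is the 'hat' function peaking at knots[j]."""
    eye = np.eye(len(knots))
    return np.column_stack(
        [np.interp(xs, knots, eye[j]) for j in range(len(knots))]
    )


def fit_pwl(xs, ys, num_knots=5):
    """Fit a continuous piecewise-linear curve to (xs, ys).

    Knots are placed at evenly spaced quantiles of xs (an assumption of this
    sketch), and the curve height at each knot is chosen by least squares.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    knots = np.quantile(xs, np.linspace(0.0, 1.0, num_knots))
    knot_ys, *_ = np.linalg.lstsq(hat_basis(xs, knots), ys, rcond=None)
    return knots, knot_ys


def to_code(name, knots, knot_ys):
    """Render the control points as one readable line (hypothetical notation)."""
    points = ", ".join(f"({x:.3g}, {y:.3g})" for x, y in zip(knots, knot_ys))
    return f"{name} = pwl([{points}])"


if __name__ == "__main__":
    # Stand-in for one univariate function taken from a trained model.
    xs = np.linspace(0.0, 10.0, 1000)
    ys = np.log1p(xs)

    knots, knot_ys = fit_pwl(xs, ys, num_knots=6)
    approx = np.interp(xs, knots, knot_ys)  # evaluate the distilled curve
    rmse = float(np.sqrt(np.mean((approx - ys) ** 2)))

    print(to_code("feature_score", knots, knot_ys))
    print(f"RMSE against the original function: {rmse:.4f}")
```

A handful of control points is short enough to read and edit by hand, which is the property the abstract emphasizes; using more knots trades that conciseness for a closer fit.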

Code & Models

Repositories

google/pwlfit (official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Imbalanced Data Classification Techniques