A Generic Approach for Reproducible Model Distillation

Yunzhe Zhou; Peiru Xu; Giles Hooker

arXiv:2211.12631·stat.ML·May 1, 2023

TL;DR

This paper introduces a generic method for stable model distillation that makes the selected student model reproducible across several classes of interpretable models, using a statistical testing framework based on the central limit theorem for the average loss.

Contribution

The paper develops a generic, statistically grounded approach to stable model distillation that applies to multiple types of student model, whereas earlier stabilization methods were each developed for a specific student model.

Findings

01 Successfully applied to decision trees, rule lists, and symbolic regression.

02 Demonstrated effectiveness on Mammographic Mass and Breast Cancer datasets.

03 Provided theoretical analysis and publicly available code.

Abstract

Model distillation has been a popular method for producing interpretable machine learning. It uses an interpretable "student" model to mimic the predictions made by the black box "teacher" model. However, when the student model is sensitive to the variability of the data sets used for training, even when the teacher is kept fixed, the corresponding interpretation is not reliable. Existing strategies stabilize model distillation by checking whether a large enough corpus of pseudo-data is generated to reliably reproduce student models, but methods to do so have so far been developed for a specific student model. In this paper, we develop a generic approach for stable model distillation based on the central limit theorem for the average loss. We start with a collection of candidate student models and search for candidates that reasonably agree with the teacher. Then we construct a multiple…
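The central-limit-theorem idea for the average loss can be illustrated with a small sketch. The abstract is truncated here, so this does not reproduce the paper's testing procedure. It assumes only that each candidate student has a per-point loss against the teacher on a shared set of pseudo-data, and it uses a plain paired z-score as a stand-in. The function name `clt_gap_zscores` and the toy numbers are hypothetical.

```python
import math
import numpy as np


def clt_gap_zscores(losses):
    """Compare each candidate student with the lowest-average-loss candidate.

    losses: array of shape (n_points, n_candidates); entry [i, j] is the loss
    of hypothetical candidate j against the teacher at pseudo-data point i.
    Returns (best_index, z), where z[j] is the paired z-score of
    mean(loss_j - loss_best). z[best_index] is NaN.
    """
    losses = np.asarray(losses, dtype=float)
    n, k = losses.shape
    best = int(np.argmin(losses.mean(axis=0)))
    z = np.full(k, np.nan)
    for j in range(k):
        if j == best:
            continue
        # Paired differences share the same pseudo-data, so by the CLT their
        # mean is approximately normal with standard error std / sqrt(n).
        diff = losses[:, j] - losses[:, best]
        se = diff.std(ddof=1) / math.sqrt(n)
        if se > 0:
            z[j] = diff.mean() / se
        else:
            z[j] = math.inf if diff.mean() > 0 else 0.0
    return best, z


# Toy losses for two hypothetical candidates on four pseudo-data points.
toy_losses = np.array([[0.1, 0.5], [0.2, 0.7], [0.1, 0.6], [0.3, 0.9]])
best, z = clt_gap_zscores(toy_losses)
```

Under these assumptions, a large positive z-score suggests the gap between a candidate and the best one is unlikely to be an artifact of which pseudo-data were sampled, while a small one suggests the choice could flip on a fresh sample and that more pseudo-data may be needed.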

Code & Models

Repositories

yunzhe-zhou/GenericDistillation (official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Topic Modeling