A Generic Approach for Reproducible Model Distillation
Yunzhe Zhou, Peiru Xu, Giles Hooker

TL;DR
This paper introduces a universal method for stable model distillation that ensures reproducibility across various interpretable models by using a statistical testing framework based on the central limit theorem.
Contribution
It develops a generic, statistically grounded approach for stable model distillation applicable to multiple model types, improving reproducibility and reliability.
Findings
Successfully applied to decision trees, rule lists, and symbolic regression.
Demonstrated effectiveness on Mammographic Mass and Breast Cancer datasets.
Provided theoretical analysis and publicly available code.
Abstract
Model distillation has been a popular method for producing interpretable machine learning. It uses an interpretable "student" model to mimic the predictions made by the black-box "teacher" model. However, when the student model is sensitive to the variability of the data sets used for training, even when the teacher is kept fixed, the corresponding interpretation is not reliable. Existing strategies stabilize model distillation by checking whether a large enough corpus of pseudo-data is generated to reliably reproduce student models, but methods to do so have so far been developed only for specific student models. In this paper, we develop a generic approach for stable model distillation based on the central limit theorem for the average loss. We start with a collection of candidate student models and search for candidates that reasonably agree with the teacher. Then we construct a multiple…
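The role of the central limit theorem for the average loss can be illustrated with a small sketch. This is not the paper's procedure (the abstract is truncated before the testing step is described); it is a plain paired z-test, assuming squared-error loss against the teacher's predictions and simulated pseudo-data. The names `paired_loss_z_test` and `simulate_losses` and all noise levels are hypothetical, chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def paired_loss_z_test(losses_a, losses_b):
    """CLT-based z-test on the mean per-sample loss difference (A minus B).

    Returns (mean_diff, z, p_value), where p_value is one-sided for the
    alternative "candidate B has lower average loss than candidate A".
    """
    if len(losses_a) != len(losses_b) or len(losses_a) < 2:
        raise ValueError("need two equal-length loss lists with at least 2 samples")
    diffs = [a - b for a, b in zip(losses_a, losses_b)]
    mean_diff = mean(diffs)
    # Standard error of the average difference; by the CLT the ratio
    # mean_diff / std_err is approximately standard normal for large n.
    std_err = stdev(diffs) / math.sqrt(len(diffs))
    z = mean_diff / std_err
    p_value = 1.0 - NormalDist().cdf(z)
    return mean_diff, z, p_value


def simulate_losses(n, seed=0):
    """Hypothetical pseudo-data: two students that imitate a fixed teacher
    with different noise levels (assumed values, not from the paper)."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n)]
    student_a = [t + rng.gauss(0.0, 0.5) for t in teacher]  # noisier candidate
    student_b = [t + rng.gauss(0.0, 0.3) for t in teacher]  # closer candidate
    losses_a = [(s - t) ** 2 for s, t in zip(student_a, teacher)]
    losses_b = [(s - t) ** 2 for s, t in zip(student_b, teacher)]
    return losses_a, losses_b


if __name__ == "__main__":
    losses_a, losses_b = simulate_losses(2000, seed=0)
    mean_diff, z, p_value = paired_loss_z_test(losses_a, losses_b)
    print(f"mean loss difference = {mean_diff:.4f}, z = {z:.2f}, p = {p_value:.3g}")
```

A small p-value here suggests the pseudo-data sample is large enough to distinguish the two candidates reliably; a large one suggests that the choice between them could flip on a fresh sample, which is the instability the abstract describes.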
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Topic Modeling
