Distilling a Neural Network Into a Soft Decision Tree

Nicholas Frosst; Geoffrey Hinton

arXiv:1711.09784 · cs.LG · November 28, 2017 · 266 cites

TL;DR

This paper presents a method for using a trained neural network to build a soft decision tree. The tree's hierarchical decisions are easier to explain than the network's distributed representations, and the tree generalizes better than one learned directly from the training data.

Contribution

It introduces an approach for transferring the knowledge acquired by a trained neural network into a soft decision tree, a model that relies on hierarchical decisions, so that a particular classification can be explained more easily.

Findings

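01 A soft decision tree created with the help of a trained neural network generalizes better than one learned directly from the training data.

02 Expressing the network's knowledge as hierarchical decisions makes a particular classification easier to explain.

03 Distilled trees retain high classification accuracy.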

Abstract

Deep neural networks have proved to be a very effective way to perform classification tasks. They excel when the input data is high dimensional, the relationship between the input and the output is complicated, and the number of labeled training examples is large. But it is hard to explain why a learned network makes a particular classification decision on a particular test case. This is due to their reliance on distributed hierarchical representations. If we could take the knowledge acquired by the neural net and express the same knowledge in a model that relies on hierarchical decisions instead, explaining a particular decision would be much easier. We describe a way of using a trained neural net to create a type of soft decision tree that generalizes better than one learned directly from the training data.
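The abstract does not spell out the model, so the following is only an illustrative sketch of what a soft decision tree and its training target could look like. It assumes each inner node routes an input to its right child with a probability given by a logistic unit, each leaf holds a learned class distribution, and the loss is a path-probability-weighted cross-entropy. The class name SoftTree, the heap-order layout, the chosen loss, and the example teacher output are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


class SoftTree:
    """Complete binary tree; inner nodes are stored in heap order (hypothetical layout)."""

    def __init__(self, n_features, n_classes, depth, seed=0):
        rng = random.Random(seed)
        self.depth = depth
        n_inner = 2 ** depth - 1
        n_leaves = 2 ** depth
        # One linear filter and bias per inner node.
        self.w = [[rng.gauss(0, 0.1) for _ in range(n_features)] for _ in range(n_inner)]
        self.b = [0.0] * n_inner
        # One vector of class logits per leaf.
        self.phi = [[rng.gauss(0, 0.1) for _ in range(n_classes)] for _ in range(n_leaves)]

    def path_probs(self, x):
        """Probability of reaching each leaf, ordered left to right."""
        probs = [1.0]  # the root is always reached
        for level in range(self.depth):
            first = 2 ** level - 1  # heap index of the first node on this level
            nxt = []
            for j, p in enumerate(probs):
                node = first + j
                z = sum(wi * xi for wi, xi in zip(self.w[node], x)) + self.b[node]
                go_right = sigmoid(z)
                nxt.append(p * (1.0 - go_right))  # left child
                nxt.append(p * go_right)          # right child
            probs = nxt
        return probs

    def predict_proba(self, x):
        """Class distribution of the leaf with the highest path probability."""
        probs = self.path_probs(x)
        best = max(range(len(probs)), key=probs.__getitem__)
        return softmax(self.phi[best])

    def loss(self, x, target):
        """Cross-entropy to each leaf distribution, weighted by path probability."""
        total = 0.0
        for p, phi in zip(self.path_probs(x), self.phi):
            q = softmax(phi)
            total += p * sum(t * math.log(qk) for t, qk in zip(target, q))
        return -total


tree = SoftTree(n_features=3, n_classes=2, depth=2)
x = [0.5, -1.0, 2.0]
hard_target = [1.0, 0.0]  # one-hot label: learning directly from the data
soft_target = [0.9, 0.1]  # made-up teacher output: learning from the neural net
print(tree.loss(x, hard_target), tree.loss(x, soft_target))
```

In this sketch the only difference between training directly on the data and training with the neural net's help is the target vector passed to loss. A prediction follows a single root-to-leaf path, so explaining it amounts to inspecting the filters along that path.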

Code & Models

Videos

Distilling Neural Networks | Two Minute Papers #218 · YouTube

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Neural Networks and Applications