Learning Accurate and Interpretable Decision Rule Sets from Neural Networks
Litao Qiao, Weijia Wang, Bill Lin

TL;DR
This paper introduces a neural network-based method for learning accurate, interpretable decision rule sets in disjunctive normal form, balancing accuracy and simplicity, and outperforming existing rule-learning algorithms.
Contribution
It presents a novel two-layer neural network architecture in which each first-layer neuron maps directly to an interpretable if-then rule, and uses sparsity-based regularization to keep the learned rule set simple.
Findings
Outperforms state-of-the-art rule-learning algorithms in accuracy.
Achieves better accuracy-simplicity trade-offs.
Produces interpretable rules with performance comparable to black-box models.
Abstract
This paper proposes a new paradigm for learning a set of independent logical rules in disjunctive normal form as an interpretable model for classification. We consider the problem of learning an interpretable decision rule set as training a neural network in a specific, yet very simple two-layer architecture. Each neuron in the first layer directly maps to an interpretable if-then rule after training, and the output neuron in the second layer directly maps to a disjunction of the first-layer rules to form the decision rule set. Our representation of neurons in this first rules layer enables us to encode both the positive and the negative association of features in a decision rule. State-of-the-art neural net training approaches can be leveraged for learning highly accurate classification models. Moreover, we propose a sparsity-based regularization approach to balance between…
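To make the rule-set structure described in the abstract concrete, here is a minimal sketch of how a trained model of this kind could be read out and evaluated as a DNF classifier. It is an illustration only, not the authors' implementation: the ternary weight encoding (+1 for a positive literal, -1 for a negated literal, 0 for an unused feature) is an assumption about how "positive and negative association" might be represented, and the names `rule_fires`, `predict`, and `describe` are hypothetical. Training and the regularization term are not shown.

```python
from typing import List, Sequence

# Assumed (hypothetical) encoding of one rule over binary features:
#   +1 -> the feature must be 1 (positive literal)
#   -1 -> the feature must be 0 (negated literal)
#    0 -> the feature is not used by this rule
Rule = Sequence[int]


def rule_fires(rule: Rule, x: Sequence[int]) -> bool:
    """Return True if every literal selected by the rule holds for x (logical AND)."""
    for w, xi in zip(rule, x):
        if w == 1 and xi != 1:
            return False
        if w == -1 and xi != 0:
            return False
    return True  # note: a rule with no literals fires on every input


def predict(rules: List[Rule], x: Sequence[int]) -> int:
    """Predict 1 if any rule fires (logical OR over rules), else 0."""
    return int(any(rule_fires(rule, x) for rule in rules))


def describe(rules: List[Rule], names: Sequence[str]) -> str:
    """Render the rule set as a human-readable DNF expression."""
    clauses = []
    for rule in rules:
        literals = [
            name if w == 1 else f"NOT {name}"
            for w, name in zip(rule, names)
            if w != 0
        ]
        clauses.append("(" + " AND ".join(literals) + ")" if literals else "TRUE")
    return " OR ".join(clauses)


# Toy example with three made-up binary features.
feature_names = ["a", "b", "c"]
example_rules = [[1, -1, 0], [0, 0, 1]]

print(describe(example_rules, feature_names))
print(predict(example_rules, [1, 0, 0]))
print(predict(example_rules, [1, 1, 0]))
```

In this toy rule set, the first rule reads "a AND NOT b" and the second reads "c", so an input is classified positive when either condition holds. Fewer rules and fewer nonzero entries per rule give a shorter expression, which is the sense in which sparsity corresponds to simplicity here.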
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Neural Networks and Applications
