Computing the Collection of Good Models for Rule Lists

Kota Mata; Kentaro Kanamori; Hiroki Arimura

arXiv:2204.11285·cs.LG·April 26, 2022

TL;DR

This paper introduces CorelsEnum, an efficient algorithm that exactly enumerates all good rule list models for a given dataset, enabling comprehensive analysis of model diversity and fairness in interpretable AI.

Contribution

The paper presents CorelsEnum, a polynomial-space algorithm that enumerates all good rule list models, improving on previous methods that are approximate or incomplete.

Findings

01. Enumerated tens of thousands of good models in seconds

02. Revealed large diversity in model predictions and fairness

03. Demonstrated the method's efficiency over existing top-K approaches

Abstract

Since Breiman's seminal 2001 paper, which pointed out the potential harm of prediction multiplicity from the viewpoint of explainable AI, global analysis of the collection of all good models, also known as a 'Rashomon set', has attracted much attention in recent years. Because finding such a set of good models is a hard computational problem, only a few algorithms exist for it so far, and most of them are either approximate or incomplete. To overcome this difficulty, we study efficient enumeration of all good models for a subclass of interpretable models called rule lists. Building on CORELS, a state-of-the-art optimal rule list learner proposed by Angelino et al. in 2017, we present an efficient enumeration algorithm, CorelsEnum, that exactly computes the set of all good models using space polynomial in the input size, given a dataset and an error tolerance from an optimal…
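To make the notion of "all good models within an error tolerance" concrete, here is a minimal brute-force sketch in Python. It is not CorelsEnum and uses none of its search or pruning techniques: it exhaustively scores every short rule list, so it only works on toy data. The objective (misclassification rate plus a length penalty `lam` per rule), the restriction to single binary features as rule antecedents, and all function and parameter names (`good_models`, `eps`, `max_len`, and so on) are assumptions made for illustration, not details taken from the paper.

```python
from itertools import permutations, product


def predict(rule_list, default, x):
    """Return the label of the first rule whose feature is 1 in x, else default."""
    for feature, label in rule_list:
        if x[feature] == 1:
            return label
    return default


def objective(rule_list, default, X, y, lam):
    """Assumed objective: misclassification rate + lam * (number of rules)."""
    errors = sum(predict(rule_list, default, x) != t for x, t in zip(X, y))
    return errors / len(y) + lam * len(rule_list)


def all_rule_lists(n_features, max_len):
    """Yield every (rule_list, default) using distinct features, up to max_len rules."""
    for k in range(max_len + 1):
        for feats in permutations(range(n_features), k):
            for labels in product((0, 1), repeat=k):
                for default in (0, 1):
                    yield tuple(zip(feats, labels)), default


def good_models(X, y, lam=0.01, eps=0.05, max_len=2):
    """Return the best objective and every model within eps of it (brute force)."""
    scored = [
        (objective(rl, d, X, y, lam), rl, d)
        for rl, d in all_rule_lists(len(X[0]), max_len)
    ]
    best = min(score for score, _, _ in scored)
    return best, [(s, rl, d) for s, rl, d in scored if s <= best + eps]


# Toy dataset (hypothetical): the label equals feature 0.
X = [(1, 0), (1, 1), (0, 1), (0, 0)]
y = [1, 1, 0, 0]

best, good = good_models(X, y)
for score, rule_list, default in sorted(good):
    print(score, rule_list, "default:", default)
```

Even on this four-row dataset more than one model falls inside the tolerance, which is the multiplicity the abstract refers to. The brute-force search is exponential in the number of features and in `max_len`, which is why an exact enumerator with polynomial space usage is the interesting contribution.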

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Bayesian Modeling and Causal Inference