LoRMIkA: Local rule-based model interpretability with k-optimal associations

Dilini Rajapaksha; Christoph Bergmeir; Wray Buntine

arXiv:1908.03840·cs.LG·June 21, 2021

TL;DR

LoRMIkA is a novel, model-agnostic method that extracts k-optimal association rules for local interpretability of machine learning predictions, offering flexible, predictive, and counterfactual explanations.

Contribution

It introduces a new flexible framework for local rule-based explanations that finds k-optimal rules considering multiple interestingness objectives, improving interpretability.

Findings

1. Achieves competitive local accuracy and interpretability on three datasets.

2. Provides multiple explanations including counterfactual rules.

3. Outperforms or matches state-of-the-art local interpretability methods.

Tables

Table 1: Coverage level of explainers

Dataset	COMPAS			Adult			German			Covertype
Black-box	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA
SVM	0.24 ± 0.10	0.50 ± 0.12	0.65 ± 0.13	0.09 ± 0.09	0.50 ± 0.10	0.73 ± 0.12	0.27 ± 0.13	0.55 ± 0.09	0.37 ± 0.03	-	-	-
DT	0.05 ± 0.05	0.51 ± 0.13	0.79 ± 0.12	0.07 ± 0.09	0.49 ± 0.11	0.83 ± 0.05	0.03 ± 0.03	0.50 ± 0.12	0.60 ± 0.15	0.00 ± 0.00	0.30 ± 0.18	0.92 ± 0.14
LR	0.18 ± 0.13	0.52 ± 0.14	0.88 ± 0.05	0.10 ± 0.10	0.49 ± 0.11	0.88 ± 0.03	0.06 ± 0.07	0.51 ± 0.10	0.77 ± 0.14	0.00 ± 0.00	0.41 ± 0.04	0.96 ± 0.09
NN	0.09 ± 0.08	0.50 ± 0.13	0.84 ± 0.08	0.05 ± 0.06	0.49 ± 0.10	0.86 ± 0.03	0.03 ± 0.04	0.51 ± 0.10	0.68 ± 0.17	0.00 ± 0.00	0.67 ± 0.41	0.96 ± 0.09
RF	0.08 ± 0.07	0.49 ± 0.11	0.82 ± 0.11	0.09 ± 0.11	0.49 ± 0.10	0.83 ± 0.05	0.19 ± 0.17	0.51 ± 0.10	0.70 ± 0.08	0.00 ± 0.00	0.36 ± 0.13	0.97 ± 0.09

Table 2: Confidence of explainers

Dataset	COMPAS			Adult			German			Covertype
Black-box	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA
SVM	0.99 ± 0.01	0.98 ± 0.14	0.99 ± 0.01	0.98 ± 0.02	0.98 ± 0.05	1.00 ± 0.00	0.98 ± 0.01	0.99 ± 0.00	0.99 ± 0.00	-	-	-
DT	0.98 ± 0.02	0.97 ± 0.08	0.91 ± 0.03	0.97 ± 0.02	0.98 ± 0.05	0.96 ± 0.01	0.98 ± 0.02	0.98 ± 0.06	0.84 ± 0.04	0.88 ± 0.08	0.68 ± 0.41	0.80 ± 0.05
LR	0.97 ± 0.02	0.97 ± 0.09	1.00 ± 0.00	0.97 ± 0.01	0.98 ± 0.04	1.00 ± 0.00	0.98 ± 0.01	0.98 ± 0.06	0.99 ± 0.01	0.97 ± 0.01	0.93 ± 0.13	0.97 ± 0.01
NN	0.98 ± 0.02	0.97 ± 0.08	0.99 ± 0.01	0.97 ± 0.01	0.98 ± 0.05	1.00 ± 0.00	0.98 ± 0.02	0.98 ± 0.06	1.00 ± 0.00	0.96 ± 0.02	0.67 ± 0.40	0.88 ± 0.03
RF	0.97 ± 0.02	0.98 ± 0.05	0.98 ± 0.01	0.97 ± 0.01	0.98 ± 0.05	0.99 ± 0.01	0.97 ± 0.01	0.98 ± 0.05	0.97 ± 0.01	0.93 ± 0.06	0.80 ± 0.30	0.95 ± 0.02

Table 3: Rate of Interestingness

Dataset	COMPAS			Adult			German			Covertype
Black-box	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA
SVM	0.01 ± 0.01	0.83 ± 0.32	3.11 ± 0.18	0.02 ± 0.02	0.84 ± 0.29	2.61 ± 0.94	0.02 ± 0.01	0.90 ± 0.09	1.90 ± 0.17	-	-	-
DT	0.02 ± 0.02	0.82 ± 0.34	1.25 ± 0.17	0.03 ± 0.02	0.81 ± 0.34	1.85 ± 0.61	0.02 ± 0.02	0.82 ± 0.35	0.99 ± 0.19	1.00 ± 0.00	0.50 ± 0.84	1.58 ± 0.20
LR	0.03 ± 0.02	0.81 ± 0.33	3.33 ± 0.49	0.03 ± 0.01	0.82 ± 0.34	5.14 ± 0.98	0.02 ± 0.01	0.83 ± 0.30	3.16 ± 0.58	1.00 ± 0.00	0.98 ± 0.02	2.05 ± 0.12
NN	0.02 ± 0.02	0.82 ± 0.33	2.87 ± 0.44	0.03 ± 0.01	0.83 ± 0.33	5.22 ± 1.24	0.02 ± 0.02	0.88 ± 0.29	2.43 ± 0.29	1.00 ± 0.00	0.48 ± 0.84	1.90 ± 0.39
RF	0.03 ± 0.02	0.81 ± 0.34	2.05 ± 0.31	0.03 ± 0.01	0.82 ± 0.35	3.19 ± 1.20	0.03 ± 0.01	0.83 ± 0.33	2.17 ± 0.66	1.00 ± 0.00	0.78 ± 0.59	2.02 ± 0.40

Table 4: Run-time of explainers

Covertype Dataset
Black-box	Anchor	LORE	LoRMIkA
DT	41.28s	22.60s	38.31s
LR	2.19s	7.42s	38.15s
NN	20.48s	20.13s	38.29s
RF	24.81s	289.41s	38.27s

Table 5: Jaccard measure of stability in explainers

Dataset	COMPAS			Adult			German			Covertype
Black-box	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA
SVM	0.85 ± 0.18	0.50 ± 0.12	1.00 ± 0.00	0.73 ± 0.25	0.91 ± 0.11	1.00 ± 0.00	0.74 ± 0.26	0.93 ± 0.08	1.00 ± 0.00	0.27 ± 0.13	0.55 ± 0.09	1.00 ± 0.00
DT	0.78 ± 0.19	0.74 ± 0.26	1.00 ± 0.00	0.75 ± 0.20	0.79 ± 0.24	1.00 ± 0.00	0.74 ± 0.69	0.79 ± 0.24	1.00 ± 0.00	0.63 ± 0.22	0.74 ± 0.21	1.00 ± 0.00
LR	0.80 ± 0.18	0.52 ± 0.14	1.00 ± 0.00	0.73 ± 0.26	0.82 ± 0.20	1.00 ± 0.00	0.74 ± 0.26	0.82 ± 0.21	1.00 ± 0.00	0.84 ± 0.16	0.76 ± 0.21	1.00 ± 0.00
NN	0.74 ± 0.24	0.83 ± 0.19	1.00 ± 0.00	0.74 ± 0.22	0.78 ± 0.25	1.00 ± 0.00	0.68 ± 0.26	1.00 ± 0.00	1.00 ± 0.00	0.66 ± 0.23	0.76 ± 0.19	1.00 ± 0.00
RF	0.79 ± 0.23	0.49 ± 0.11	1.00 ± 0.00	0.74 ± 0.23	0.82 ± 0.20	1.00 ± 0.00	0.75 ± 0.22	0.75 ± 0.20	1.00 ± 0.00	0.65 ± 0.21	0.67 ± 0.26	1.00 ± 0.00

Table 6: Number of features in the explanation

Dataset	COMPAS			Adult			German			Covertype
Black-box	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA	Anchor	LORE	LoRMIkA
SVM	1.71 ± 0.25	1.36 ± 0.29	1.03 ± 0.19	2.61 ± 0.31	1.24 ± 0.25	1.00 ± 0.00	2.87 ± 0.37	1.10 ± 0.10	1.00 ± 0.00	-	-	-
DT	4.77 ± 0.47	1.84 ± 0.59	1.25 ± 0.44	4.23 ± 0.58	2.30 ± 0.61	1.26 ± 0.45	4.27 ± 1.05	2.44 ± 0.90	1.21 ± 0.42	28.50 ± 8.28	4.77 ± 0.77	2.25 ± 1.29
LR	2.08 ± 0.31	1.28 ± 0.30	1.40 ± 0.50	2.73 ± 0.52	1.90 ± 0.47	1.41 ± 0.50	2.80 ± 0.45	1.85 ± 0.46	1.20 ± 0.42	3.21 ± 0.64	4.11 ± 0.74	6.46 ± 3.59
NN	3.99 ± 0.47	1.32 ± 0.29	1.38 ± 0.53	3.70 ± 0.39	1.98 ± 0.49	1.46 ± 0.57	4.03 ± 0.67	1.00 ± 0.00	1.27 ± 0.45	10.00 ± 3.68	6.13 ± 0.96	5.48 ± 3.78
RF	4.20 ± 0.34	1.61 ± 0.43	1.37 ± 0.50	3.46 ± 0.45	2.04 ± 0.47	1.28 ± 0.44	3.10 ± 0.60	2.50 ± 0.91	1.07 ± 0.28	15.39 ± 4.36	3.60 ± 0.80	8.77 ± 7.46

Equations

\mathrm{Support}(p) = \frac{|\{\text{instance} \in \text{Dataset} : \text{instance fulfills } p\}|}{\text{total number of instances in the dataset}}

\mathrm{Support}(p \rightarrow q) = \mathrm{Support}(p \land q)

\mathrm{Coverage}(p \rightarrow q) = \mathrm{Support}(p)

\mathrm{Confidence}(p \rightarrow q) = \frac{\mathrm{Support}(p \rightarrow q)}{\mathrm{Support}(p)}

\mathrm{Lift}(p \rightarrow q) = \frac{\mathrm{Support}(p \rightarrow q)}{\mathrm{Support}(p) \times \mathrm{Support}(q)}

\mathrm{Leverage}(p \rightarrow q) = \mathrm{Support}(p \rightarrow q) - (\mathrm{Support}(p) \times \mathrm{Support}(q))

K(x, x^{\prime}) = \exp\left(-\frac{\lVert x - x^{\prime} \rVert^{2}}{2w^{2}}\right)

I_{crossover} = x + (y - x) * α

I_{mutation} = x + (y - z) * σ

ℜ_{c α_LoRMIkA}^{+}

is_violent_recid = 1) target \Rightarrow High}

ℜ_{c α_Anchor}^{+} = {(is_recid = 1 & c_charge_degree = F & sex = Male & days_b_screening_arrest \leq 1.00 & 25 < age \leq 31 & 2 < priors_count \leq 5 & is_violent_recid = 1) target \Rightarrow High}

ℜ_{c α_LORE}^{+} = {(age \leq 27) target \Rightarrow High}

ℜ_{c β_LoRMIkA}^{-}

ℜ_{h α_LoRMIkA}^{+}

ℜ_{h β_LoRMIkA}^{-}

ℜ_{h β_LORE}^{-}

J (X, Y) = ∣ X \cap Y ∣/∣ X \cup Y ∣

Taxonomy

Methods; Interpretability

Full text

LoRMIkA: Local Rule-based Model Interpretability with k-optimal Associations

Dilini Rajapaksha

Christoph Bergmeir

Wray Buntine

Faculty of Information Technology, Monash University, Melbourne, Australia.

Abstract

As we rely more and more on machine learning models for real-life decision-making, being able to understand and trust the predictions becomes ever more important. Local explainer models have recently been introduced to explain the predictions of complex machine learning models at the instance level. In this paper, we propose Local Rule-based Model Interpretability with k-optimal Associations (LoRMIkA), a novel model-agnostic approach that obtains k-optimal association rules from a neighbourhood of the instance to be explained. Compared with other rule-based approaches in the literature, we argue that the most predictive rules are not necessarily the rules that provide the best explanations. Consequently, the LoRMIkA framework provides a flexible way to obtain predictive and interesting rules. It uses an efficient search algorithm guaranteed to find the k-optimal rules with respect to objectives such as confidence, lift, leverage, coverage, and support. It also provides multiple rules which explain the decision and counterfactual rules, which give indications for potential changes to obtain different outputs for given instances. We compare our approach to other state-of-the-art approaches in local model interpretability on three different datasets and achieve competitive results in terms of local accuracy and interpretability.

Keywords: interpretability, local-interpretability, k-optimal, class-association-rules

1 Introduction

Explainability of machine learning models is becoming ever more important [23]. For example, the European General Data Protection Regulation from 2018 contains a “right to explanation” for any decision made by a predictive model. Traditionally, explainability is achieved with machine learning models that are considered interpretable, such as classic linear regression, logistic regression, or decision trees [33]. There are also newer developments in the area of globally interpretable models; e.g., Proença and van Leeuwen [30] introduced an approach that formalizes multi-class classification problems based on a probabilistic rule list and the minimum description length principle. Recently, however, local model-agnostic interpretability methods have been introduced that offer explanations for the predictions of any (black-box) model. These methods typically fit a simpler model to the output of the more complex black-box model locally, in the neighbourhood of an instance, assuming that the behaviour of the instance to be explained is similar to the behaviour of its neighbourhood. In this way, they provide local explanations that can be more accurate and relevant for the instance under consideration than global explanations [6, 3], which do not change across the dataset.

The first work in this line that paved the way for others was the work of Ribeiro et al. [34] that introduced Local Interpretable Model-agnostic Explanations (LIME). Here, instances are randomly perturbed around the instance to be explained and a linear model is fitted to the data in the selected neighbourhood of the considered instance. However, LIME has various limitations. For instance, it does not perform well when the features have a higher degree of interaction and non-linear relationships with the target variable. Consequently, Lundberg and Lee [24] introduced SHapley Additive exPlanation (SHAP) that overcomes many of the shortcomings of LIME by providing consistent additive explanations based on game theory. Both LIME and SHAP are feature-importance based explanations that show the features which have the most positive or negative impact on the prediction of the global model for a particular instance.

In recent years, deep learning methods have proven extremely successful, in particular in application areas with many highly dependent features, such as images, text, or speech. Here, the importance of individual features, i.e., the pixels of an image or the words in a sentence, often cannot be considered interpretable. Researchers in these areas address this issue mainly in two ways. The first is to use sample-based explanation methods, which explain how the model parameters are derived from the training data. For example, Koh and Liang [19] considered influence functions to capture the core idea of the model under consideration. Yeh et al. [42] proposed a method, based on the representer theorem, that explains the prediction of a deep neural network for a given test instance using so-called representer values, which measure the weighted importance of each training instance towards the learned parameters of the model. Those authors showed that the pre-activation prediction values can be decomposed into a linear combination of training point activations, with the weights corresponding to the representer values. Representer instances can then be selected based on the magnitude of their representer values, so that a model prediction can be understood through this decomposition by identifying its representer instances in the training set. Khanna et al. [18] use Sequential Bayesian Quadrature (SBQ) for efficient selection of instances, and Fisher kernels to obtain a feature embedding for each data point. Arik and Pfister [2] apply an attention-based mechanism to identify the training set instances that relate to the prediction for the given test instance. The second main avenue for explainers in deep learning is concept-based explanations. For example, Been et al. [5] proposed Concept Activation Vectors (CAV) to interpret the internal state of neural networks. The CAVs in this approach are considered to be human-friendly explanatory concepts. Further, Zhang et al. [46] considered multiple Gaussian models to represent the distribution of the data, where each Gaussian model reflects some local characteristics of the dataset.

However, especially outside of the field of deep learning, recently many researchers have shifted from linear models as explainers to rule-based explanations [30, 28], as they arguably provide more precise explanations to the end users [35] and are more interpretable [21] compared with others. Puri et al. [31] introduced a global rule-based explainer. Lakkaraju et al. [22] proposed a model-agnostic framework called Model Understanding through Subspace Explanations (MUSE), which explains the global model predictions by considering the different subgroups of the instances which are characterized by features of user interest. Ribeiro et al. [35] proposed Anchor-LIME, which is a local-explainable model. It selects a neighbourhood with the help of a bandit algorithm [17] and extracts rules afterwards from the neighbourhood while finding subsets of features that retain the same prediction when held constant, even though all other features are changed. The rules obtained from that algorithm are called anchors by those authors. A limitation of Anchor-LIME is that it does not provide counterfactual rules which are often important to invert a decision. LOcal Rule-based Explanations (LORE) was introduced by Guidotti et al. [13]. In this algorithm a decision tree is built in the neighbourhood of an instance to be explained, to acquire a single rule for the decision and a set of counterfactual rules for the inverse decision.

In this paper, we propose a flexible framework for Local Rule-based Model Interpretability with k-optimal Associations (LoRMIkA) to explain the predictions produced for tabular classification datasets. Here, k is a user-defined parameter of the maximum number of (non-redundant) class association rules to be extracted.

In accordance with other local-interpretability frameworks [34, 35, 13], the key assumption of LoRMIkA is that the behaviour of the instance to be explained is similar to the behaviour of the instances in its neighbourhood. To explain an instance, we first select from the training dataset the instances most similar to it, and we then generate synthetic instances within the neighbourhood. Afterwards, we use the global machine learning model to generate predictions for all instances in the neighbourhood (i.e., both selected and generated instances), in order to identify the logic and behaviour of the global model when it makes predictions. Here, we assume that the logic used to generate these predictions is similar to the logic behind the prediction for the instance to be explained. Then we mine a total of k optimal class association rules [41] between the instances in the neighbourhood and their predictions, using the OPUS search algorithm [38] with respect to measures popular in the association rule mining community, as described in Section 2.
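The neighbourhood construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_neighbourhood`, its parameters, the Euclidean distance, and the interpolation step (loosely modelled on the crossover formula listed among the paper's equations, `I_crossover = x + (y - x) * α`) are all choices made here for the example, and `black_box` stands for any callable that returns the global model's prediction for one instance.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two numeric feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def build_neighbourhood(instance, training_data, black_box,
                        n_nearest=5, n_synthetic=10, seed=0):
    """Return (row, prediction) pairs for a local neighbourhood.

    All names and defaults are hypothetical; the paper's exact distance
    measure and instance generator are not reproduced here.
    """
    rng = random.Random(seed)

    # Step 1: pick the training instances most similar to the instance.
    nearest = sorted(training_data,
                     key=lambda row: euclidean(row, instance))[:n_nearest]

    # Step 2: generate synthetic instances by interpolating between the
    # instance x and a randomly chosen neighbour y: x + (y - x) * alpha.
    synthetic = []
    for _ in range(n_synthetic):
        y = rng.choice(nearest)
        alpha = rng.random()
        synthetic.append([xi + (yi - xi) * alpha
                          for xi, yi in zip(instance, y)])

    # Step 3: label every neighbourhood instance with the black-box model.
    neighbourhood = nearest + synthetic
    return [(row, black_box(row)) for row in neighbourhood]
```

The labelled pairs returned by such a function would then be the input to the rule-mining step.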

LoRMIkA leverages decades of research in association rule mining and is thereby able to overcome a number of drawbacks of existing algorithms. We argue that class association rules are better suited to provide explanations than, for instance, linear models or decision trees, as they are local models with respect to both features and instances. Furthermore, we consider one of the main limitations of current state-of-the-art rule-based local model-agnostic explainers to be that they consider only the most predictive rules. We argue that the most predictive rules are not necessarily the rules that explain best: there can be interesting rules that are very useful explainers even though they are not highly predictive, and the state-of-the-art algorithms do not consider such rules as explanations of a prediction. We provide a detailed discussion of rules that predict versus rules that explain in Section 2.2.3. LoRMIkA uses an efficient search algorithm to find the k-optimal rules in a neighbourhood and is thereby able to produce model-agnostic explanations for a given black-box algorithm. LoRMIkA tackles redundancy of features in rules and is able to generate simple rules. Furthermore, we are able to search for the k best rules with respect to confidence, lift, leverage, coverage, and support, which are measures developed by the association rule mining community [12] to assess whether rules are predictive and/or interesting.
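To make the idea of searching for the k best rules under a chosen measure concrete, the following sketch ranks candidate rules by exhaustive enumeration. It is a stand-in for illustration only: it does not reproduce the OPUS search algorithm, it performs no redundancy filtering, and the function name `top_k_rules`, its parameters, and the dictionary-based row format are assumptions made for this example.

```python
from itertools import combinations


def top_k_rules(rows, labels, target, k, measure="leverage", max_len=2):
    """Rank rules `antecedent -> target` by a measure and keep the best k.

    rows   : list of dicts mapping feature name -> categorical value
    labels : black-box predictions, one per row
    Brute-force enumeration (hypothetical helper), not the OPUS algorithm.
    """
    n = len(rows)
    support_q = sum(1 for lab in labels if lab == target) / n
    # Candidate conditions are all (feature, value) pairs seen in the data.
    items = sorted({(f, v) for row in rows for f, v in row.items()})

    scored = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            # Skip antecedents that test the same feature twice.
            if len({f for f, _ in antecedent}) < size:
                continue
            matches = [all(row.get(f) == v for f, v in antecedent)
                       for row in rows]
            support_p = sum(matches) / n
            if support_p == 0:
                continue
            support_pq = sum(1 for m, lab in zip(matches, labels)
                             if m and lab == target) / n
            scores = {
                "support": support_pq,
                "coverage": support_p,
                "confidence": support_pq / support_p,
                "lift": (support_pq / (support_p * support_q)
                         if support_q else 0.0),
                "leverage": support_pq - support_p * support_q,
            }
            scored.append((scores[measure], antecedent))

    # Python's sort is stable, so ties keep their enumeration order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Calling the function with `measure="confidence"` and then with `measure="leverage"` on the same neighbourhood is a simple way to see that the most predictive rules and the most interesting rules need not coincide.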

The state-of-the-art algorithms currently produce two types of rules which 1) explain the prediction towards the decision and 2) explain with counterfactual rules how to reverse the current decision of the global model. LoRMIkA generates two additional types of rules (four types of rules in total) that provide additional information and explanations for the prediction provided by the global model. Section 3.7 explains the different types of rules in detail and illustrates and demonstrates how they can be used for explanations.

We have tested and compared LoRMIkA with other state-of-the-art approaches on four different tabular classification datasets and conducted both qualitative and quantitative experiments to assess both interpretability and local accuracy, with competitive results.

In summary, the contributions of this paper are as follows.

1. We propose a novel approach based on class association rules to generate local rule-based explanations for the predictions of black-box machine learning models for classification problems.

2. Our approach produces four types of rules that explain the prediction, namely: conditions that positively support the decision of the prediction, conditions that negatively support the decision of the prediction, conditions that increase the probability towards the decision, and conditions that potentially reverse the decision.

3. In our approach, rules can be searched for with different optimization goals (e.g., coverage, lift, confidence) to generate predictive and/or interesting rules based on user preference.

The remainder of this paper is structured as follows. Section 2 revisits the most important results from the association rule mining community with respect to local explainers. Section 3 presents our approach. Section 4 discusses the experimental setup and results, and Section 5 concludes the paper.

2 Association Rule Mining

In this section, we revisit the major relevant findings in association rule mining research with respect to association rules being local models, predictive versus explanatory rules, and efficient ways for finding rules.

2.1 Notations and Definitions

Association rule mining is a rule-based machine learning approach for finding interesting relationships, such as frequent patterns, correlations, associations, or causal structures, between variables in large databases. It was first introduced by Agrawal et al. [1]. In the following, we define the concepts used in our work.

Definition 1.

Class Association Rule: A class association rule is a rule of the form $r=p\rightarrow q$ . The antecedent or left-hand side (LHS) of a rule, $p$ , is a boolean condition on feature values. The consequent or right-hand side (RHS) of a rule, $q$ , is the target class of the decision variable.

Definition 2.

Support: The proportion of instances in the dataset $D$ that match $p$ , out of the total number of instances in $D$ .

$$\mathrm{Support}(p)=\frac{|\{d\in D:\ d\text{ matches }p\}|}{|D|}$$

Definition 3.

Coverage: The support of the antecedent of an association rule.

$$\mathrm{Coverage}(p\rightarrow q)=\mathrm{Support}(p)$$

Definition 4.

Confidence (i.e., Strength or Precision): The proportion of instances that match both the antecedent and the consequent, out of the instances that match the antecedent.

$$\mathrm{Confidence}(p\rightarrow q)=\frac{\mathrm{Support}(p\wedge q)}{\mathrm{Support}(p)}$$

Definition 5.

Lift: Measures how often the target class $q$ appears in instances that match $p$ , while controlling for the overall frequency of $q$ . This is a measure of how surprising or interesting a rule is, i.e., how much it differs from a random association, in the sense that it represents a non-trivial correlation between antecedent and consequent [39]. A lift equal to one implies that no association can be found. A lift between zero and one indicates a negative association, and a lift greater than one indicates a positive association.

$$\mathrm{Lift}(p\rightarrow q)=\frac{\mathrm{Confidence}(p\rightarrow q)}{\mathrm{Support}(q)}$$

Definition 6.

Leverage: The observed frequency of $p$ and $q$ appearing together minus the frequency that would be expected if $p$ and $q$ were statistically independent. Leverage takes values in $[-1,1]$ . When the leverage is equal to zero, no association can be found. When it is positive (negative), there is a positive (negative) relationship between antecedent and consequent.

$$\mathrm{Leverage}(p\rightarrow q)=\mathrm{Support}(p\wedge q)-\mathrm{Support}(p)\,\mathrm{Support}(q)$$
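As an illustration of Definitions 2 to 6, the following minimal Python sketch computes coverage, confidence, lift, and leverage for a single class association rule on a small table. The function name `rule_measures`, the data layout, and the toy data are assumptions made for this example and are not taken from the LoRMIkA implementation.

```python
from fractions import Fraction

def rule_measures(rows, antecedent, target):
    """Compute the measures of Definitions 2-6 for the rule antecedent -> target.

    rows: list of (features, label) pairs, where features is a dict.
    antecedent: dict mapping feature name to required value (the condition p).
    target: class label in the consequent (q).
    """
    n = len(rows)
    match_p = [all(feats.get(k) == v for k, v in antecedent.items())
               for feats, _ in rows]
    match_q = [label == target for _, label in rows]
    n_p = sum(match_p)
    n_q = sum(match_q)
    n_pq = sum(p and q for p, q in zip(match_p, match_q))
    if n_p == 0 or n_q == 0:
        raise ValueError("antecedent and target must each match at least one row")
    coverage = Fraction(n_p, n)        # support of the antecedent
    support_q = Fraction(n_q, n)       # support of the consequent
    confidence = Fraction(n_pq, n_p)   # support(p and q) / support(p)
    return {
        "coverage": coverage,
        "confidence": confidence,
        "lift": confidence / support_q,
        "leverage": Fraction(n_pq, n) - coverage * support_q,
    }

# Hypothetical toy data: 10 patients, one feature, binary label.
rows = ([({"smoker": "yes"}, "risk")] * 3 + [({"smoker": "yes"}, "ok")]
        + [({"smoker": "no"}, "risk")] + [({"smoker": "no"}, "ok")] * 5)
measures = rule_measures(rows, {"smoker": "yes"}, "risk")
```

On this toy data the rule smoker = yes → risk has coverage 2/5, confidence 3/4, lift 15/8, and leverage 7/50, i.e., a positive association according to both lift and leverage.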

2.2 Advantages of using association rules as local explainers

In this section we discuss the advantages of considering association rules as local explainers over other interpretable models such as decision trees or linear models.

2.2.1 Rules versus feature importance of linear models

The prediction of a linear model is a weighted sum of the feature inputs, so the model can represent only linear relationships. Another drawback of linear models is that if there are input variables that are highly correlated, the estimated coefficients of the model can be high for either of the correlated features, which does not affect the predictive abilities of the model but leads to poor explanations.

2.2.2 Association rules are local models

A typical machine learning algorithm for prediction will have to produce a single global model that is often the result of some form of model selection. In this process, interesting findings of the dataset that may contain useful explanations may get lost and not be considered, as usually there will be not one single valid model, but different potentially equally valid ones. Webb [40] discussed the characteristics of association rules being local models, as they consider only certain features and only certain values of these features, thus only considering a subspace of the feature space. Association rule mining aims to discover all such local models. For example, if there are two equally predictive rules, we aim to find both of these rules which help users when making decisions. Also, the globally optimal model may not be the optimal solution for a locally defined region in the subspace. Association rule mining can find the optimal models in any specified region, which will be more efficient than a global model [40, 25].

2.2.3 Rules that predict vs rules that explain

The predictability of a rule can be measured by its confidence, and the interestingness of a rule can be measured by its lift or leverage. Accordingly, a rule with high confidence is said to be a predictive rule, and a rule with high lift or leverage is said to be an interesting rule. Novak et al. [27] discuss the differences between interpretability and predictability, showing that the most predictive rules and the rules that best explain a given dataset will usually be different. Using the example of a C4.5 decision tree as a predictive algorithm, they illustrate that redundant rules will be ignored, while in descriptive algorithms, redundant rules should be considered. On the other hand, highly predictive rules may result from spurious correlations in the training data if they represent only a small number of examples. Such rules will be filtered out by an adequate descriptive algorithm, while a predictive algorithm may be forced to take such rules into account for the sake of completeness of the predictions. Thus, rules that are not predictive may still be of interest for understanding a dataset, and conversely, highly predictive rules may not be useful for explanations.

2.2.4 Mining interesting rules efficiently

The traditional approach for association rule mining is the Apriori approach [1] whose fundamental step is to find the itemsets that occur most often, the so-called frequent itemsets. To select frequent itemsets, a minimum support needs to be given as a parameter.

Association rules are then determined within these frequent itemsets. The frequent itemset association paradigm has a number of drawbacks, such as its inability to uncover higher order associations, as these are comparatively infrequent. This is known as the so-called vodka and caviar problem [7]. Another limitation of the approach is that minimum support is not a reasonable threshold to regulate the number of associations, as it is impossible to select the number of associations in advance. So it boils down to a trial and error process, and therewith the minimum support constraint is not a faithful parameter to acquire interesting association rules [40].

Webb [38] presents the Optimized Pruning for Unordered Search (OPUS) algorithm, which overcomes many of the shortcomings of other association rule mining techniques. It employs a statistically sound process of selecting the top k interesting rules, i.e., the rules with the highest support, coverage, confidence, leverage, or lift. Furthermore, it rigorously controls the generation of spurious rules [40] and offers filter modes that adjust the length of the rules to be discovered according to input parameters. When generating interesting rules (rules with a non-trivial correlation between antecedent and consequent), OPUS uses Fisher’s exact test [39] to guard against false positives.

3 Our Approach

The main goal of our approach LoRMIkA is to explain an instance locally, with k-optimal class association rules.

3.1 Formal definition

In local model-agnostic interpretability, we retain the global model as a black box and explain the decision (or prediction) provided by the global model for each individual instance [14]. Let $f$ be a given global black-box model, $x$ be an instance to be explained, and let $y$ be the prediction provided by the global model for $x$ , i.e., $f(x)=y$ . The aim is then to provide an explanation $e$ for the prediction $y$ . Therefore, we use a local explainer model $e$ where $e=\xi(f,x)$ , which mimics the behaviour of $f$ for $x$ using a process $\xi(.,.)$ . The local interpretability model considers the neighbourhood of the instance $x$ in the global model $f$ to provide the explanation $e\in E$ that belongs to a human-interpretable domain $E$ . In our case of rule-based explanations, $E$ is a domain of rules, i.e., an explanation $e=\{r_{1},r_{2},...\}$ provides a set of class association rules $r$ which explain the prediction $y$ from $f$ for the instance $x$ .

3.2 Overview of the procedure

Broadly, we first generate a neighbourhood that has similar behaviour to the instance to be explained. For that, first, we preprocess the input data (i.e., training data and instances to be explained). Then, we select the neighbourhood of the instance to be explained with the help of a distance measure. Then, we generate new instances in the neighbourhood with mutation and crossover/interpolation techniques. After that, we generate global model predictions for the whole set of instances in the neighbourhood (i.e., generated and selected training instances) to learn the behaviour of the global model. Finally, we perform association rule mining using the OPUS search [38] algorithm with k-optimal [40] associations for the combined set of instances and their global model predictions to produce class association rules as the explanations of the prediction. The resulting rules are then categorized into four categories based on the feature values and the predicted value of the global model of the instance to be explained. An overview diagram of LoRMIkA is shown in Figure 1. Our approach is furthermore summarized in Algorithm 1 and the most important parts are discussed in the following.

3.3 Preprocessing the input data

This step corresponds to the $\text{PreProcessingInputData}(Tr_{x},I_{e})$ function in Algorithm 1. We preprocess the training data by applying Z-score normalisation for each numerical feature and one-hot encoding for each categorical feature, since the distribution of each studied feature varies in range. Note that depending on the distribution of the features, other normalisation methods such as min-max scaling may yield better results, however Z-score normalisation is arguably a good default choice for the generic case.

3.4 Determine the instances in the neighbourhood

In this step, we find the training set instances that are in the neighbourhood of the instance to be explained, since we assume that the behaviour and the properties of the instances in the neighbourhood resemble those of the instance to be explained. The process of instance selection within the given neighbourhood is outlined in Algorithm 2 and Figure 2. There are many ways for selection, e.g., Yu et al. [45] use the Kullback-Leibler (KL) divergence. The process in our approach is similar to the approach proposed by Yu et al. [44] when grouping users according to their similarities in social behaviour. A detailed explanation of the instance selection process is provided in the following.

First, the distance between each pre-processed instance in the training set (background set) and the instance that needs to be explained is calculated. We use Euclidean distance as the distance measure, as it is preferable to other measures in the interpretability context [34, 13]. Similarly, Ribeiro et al. [34] used other distance measures such as cosine distance for text, and L2 distance for images. As we use the features directly as concepts for explanations, we effectively limit our approach to situations where single features can provide meaningful explanations.

We then convert the distance to an exponential similarity score to make the distance more linear and to compare the similarity of the instances in the training set with the explained instance. The similarity is defined as follows:

[TABLE]

In Equation 7, $K(x,x^{\prime})$ is the similarity between two instances $x$ and $x^{\prime}$ , and $w$ is the kernel width (see also Algorithm 2), a tuning parameter. When $w$ is high, $K(x,x^{\prime})$ will be close to $1$ for any $x,x^{\prime}$ . When $w$ is low, $K(x,x^{\prime})$ will be close to $0$ . Following Ribeiro et al. [34], we set $w$ to 0.75 times the square root of the number of features (see Section 4.4).
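The behaviour of the kernel width can be illustrated with a small sketch. The exact functional form of the similarity is not reproduced here, so the code assumes a Gaussian-style kernel $\exp(-d^{2}/w^{2})$ on the Euclidean distance $d$; this form, and the helper names, are assumptions for illustration. The default width follows the value $0.75\times\sqrt{\text{no. of features}}$ given in the parameter setup.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity(a, b, width):
    """Exponential similarity score (assumed form: exp(-d^2 / w^2))."""
    d = euclidean(a, b)
    return math.exp(-(d ** 2) / (width ** 2))

def default_width(n_features):
    """Kernel width of 0.75 * sqrt(number of features)."""
    return 0.75 * math.sqrt(n_features)

x = [0.0, 0.0]
x_prime = [3.0, 4.0]  # Euclidean distance 5 from x
wide = similarity(x, x_prime, width=1e6)
narrow = similarity(x, x_prime, width=0.1)
```

With a very large width the two instances are treated as almost identical (similarity close to 1), while with a very small width the similarity collapses to 0 for any pair of distinct instances.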

After that, we group the similarity scores according to the target class and sort them in descending order. Then we select the instances with the top L similarities in each class, where L is the minimum number of neighbours from each class. Among the selected top L instances over all the target classes, the lowest similarity score is used as the threshold for selecting the most similar instances. This threshold value is $S_{ct}$ (i.e., the cut point of the similarity score) in Algorithm 2. Finally, considering the whole training set, we select all the instances whose similarity score is more than $S_{ct}$ for further processing.

In order to overcome the class imbalance problem, we check whether the number of instances from each class is less than or equal to M (i.e., the maximum number of neighbours from each class). Then, we sample M number of instances from the classes where the number of instances is greater than M, to select as a set of neighbours of the instance to be explained. Further, the instances of the classes where the number of instances is less than or equal to M are also considered as the neighbours of the instance to be explained. The output of Algorithm 2 (i.e., $SelectInst$ ) contains the most similar instances of the training set to the instance to be explained.
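The selection procedure described above can be sketched as follows. This is a simplified reading of the text, not a transcription of Algorithm 2: the function name `select_neighbours` and the input layout are hypothetical, and the sketch keeps instances whose similarity is greater than or equal to the cut point (an assumption, so that the top L instances of every class are retained).

```python
import random
from collections import defaultdict

def select_neighbours(scored, L, M, seed=0):
    """scored: list of (instance_id, label, similarity) triples.

    Returns the similarity cut point and the selected neighbours.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item in scored:
        by_class[item[1]].append(item)

    # Top L most similar instances of every class define the cut point.
    top = []
    for items in by_class.values():
        items.sort(key=lambda t: t[2], reverse=True)
        top.extend(items[:L])
    cut = min(t[2] for t in top)

    # Keep every training instance at least as similar as the cut point.
    selected = defaultdict(list)
    for item in scored:
        if item[2] >= cut:
            selected[item[1]].append(item)

    # Cap each class at M instances to limit class imbalance.
    neighbours = []
    for items in selected.values():
        if len(items) > M:
            items = rng.sample(items, M)
        neighbours.extend(items)
    return cut, neighbours

scored = [("a1", "A", 0.9), ("a2", "A", 0.8), ("a3", "A", 0.7), ("a4", "A", 0.6),
          ("a5", "A", 0.2), ("b1", "B", 0.5), ("b2", "B", 0.4), ("b3", "B", 0.1)]
cut, neighbours = select_neighbours(scored, L=2, M=3)
```

In this toy example the cut point is 0.4 (the second-best instance of class B). Four instances of class A pass the cut and are down-sampled to M = 3, while both passing instances of class B are kept.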

3.5 Instance Generation

The number of instances chosen from the original training set may not be enough to adequately characterise the neighbourhood of the instance in consideration. Therefore in our approach, we generate new instances using the neighbourhood instances of the training set based on crossover (i.e., interpolation) and mutation techniques, while ensuring that the majority of the new instances are inside the neighbourhood. The instance generation procedure is outlined in Algorithm 3. For the instance generation we perform the following steps.

After determining the instances in the neighbourhood, in this step we use the output of Algorithm 2 to generate more instances within the selected neighbourhood. We generate 50% of the total instances using the crossover technique defined in Equation 8. It ensures that all the generated instances are within the neighbourhood.

$$x_{new}=\alpha\,x+(1-\alpha)\,y\qquad(8)$$

Here, $x$ and $y$ are randomly selected instances from the training set in the neighbourhood of the instance to be explained, and $\alpha$ is a randomly generated number between $0$ and $1$ .

Afterwards, we set the values of the categorical features of the newly generated instances. Here, we assume that a generated instance behaves similarly to the instance to be explained if its values are similar to those of the parent instance that is most similar to the instance to be explained. Under that assumption, we select the parent instance (i.e., $x$ or $y$ ) that is most similar to the instance to be explained, and copy its categorical values to the newly generated instance.

Then, we use a mutation technique from Storn and Price [37], defined in Equation 9, to generate the rest of the newly generated instances.

$$x_{new}=x+\sigma\,(y-z)\qquad(9)$$

Here, $x,y,$ and $z$ are three different instances of the neighbourhood of the instance to be explained, and $\sigma$ is a randomly generated number between $0.5$ and $1$ . Similarly, when setting the values for the categorical features of the newly generated instances, we select the most similar instance from $x$ , $y$ , and $z$ to the instance to be explained and we set the values of the most similar instance to the newly generated instance.
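The two generation operators can be sketched for the numerical features as follows. The sketch assumes that the crossover is a convex interpolation between two parents and that the mutation is the differential-evolution operator of Storn and Price ( $x+\sigma(y-z)$ ); the function names are hypothetical, and the copying of categorical values from the most similar parent is omitted for brevity.

```python
import random

def crossover(x, y, rng):
    """Interpolate between two parents with a random alpha in [0, 1)."""
    alpha = rng.random()
    return [alpha * a + (1 - alpha) * b for a, b in zip(x, y)]

def mutation(x, y, z, rng):
    """Differential-evolution style mutation with sigma in [0.5, 1]."""
    sigma = rng.uniform(0.5, 1.0)
    return [a + sigma * (b - c) for a, b, c in zip(x, y, z)]

def generate(neighbours, n_new, seed=0):
    """Generate n_new instances: half by crossover, the rest by mutation."""
    rng = random.Random(seed)
    half = n_new // 2
    out = []
    for _ in range(half):
        x, y = rng.sample(neighbours, 2)
        out.append(crossover(x, y, rng))
    for _ in range(n_new - half):
        x, y, z = rng.sample(neighbours, 3)
        out.append(mutation(x, y, z, rng))
    return out

neighbours = [[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
generated = generate(neighbours, n_new=10)
```

Because a convex combination of two neighbours always lies on the segment between them, the crossover half of the generated set stays inside the region spanned by the neighbourhood, whereas mutated instances may fall slightly outside it.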

3.6 Generate global model predictions

This step corresponds to $\mathrm{GetPredictFrmGlobalModel}(ComnInst,M)$ in Algorithm 1. The assumption behind the generation of global model predictions is that the logic that produces the global model prediction for the instance to be explained is similar to the logic that produces the global model predictions for its neighbourhood instances. Here, we obtain the global model predictions of the newly generated instances and the neighbours taken from the training set (i.e., $SelecInst\cup GenInst$ ) to obtain the global model behaviour for this combined dataset. The intuition in this step is to generate a new training dataset in the neighbourhood of the instance to be explained to fit the local explainer (i.e., association rule mining algorithm) in the locally defined neighbourhood of the instance to be explained. This newly generated training set has the global model predictions as the target values.

3.7 Generate class association rule-based explanations

This step relates to the $\mathrm{MOGenRules}(CombInst,Pred)$ function in Algorithm 1. We generate k optimal class association rules using the OPUS search algorithm [40] with regards to different objectives such as support, coverage, confidence, lift and leverage. Here, k is the maximum number of rules to be generated for each objective. In our approach (LoRMIkA) we provide both predictive rules (i.e., rules with high confidence) and interesting rules (i.e., rules with high lift). An implementation of OPUS search to generate association rules is available in the BigML platform [9].

The reason for using the OPUS algorithm for rule extraction rather than algorithms like CPAR [43], RIPPER [8], FOIL [32], or RCAR [3] is that those algorithms focus on finding minimal rule sets which lead to accurate predictions (i.e., high confidence). As discussed in Section 2.2.3 the main drawback of focusing on the most predictive rules is the inability to uncover higher-order associations, which are relatively infrequent. The OPUS algorithm is a statistically sound algorithm which captures infrequent higher order associations (interesting rules) while ignoring false positives. As stated before, we argue that the explanations provided by the local explainers should provide good explanations and not necessarily good predictions.

The inputs to the association rule generating OPUS algorithm are the instances in the neighbourhood of the instance that needs to be explained. Both instances that were selected from the training set and newly generated instances are considered. LoRMIkA provides flexibility to search for the k-optimal rules with the highest support, coverage, confidence, lift, or leverage. This optimisation criterion is a parameter of the method, and thus this choice can be made by the user, and both predictive and interesting rules can be generated. Also, as there is in general the risk of over-estimating the confidence when the coverage is low, the OPUS algorithm uses an m-estimate [11] to adjust the values of confidence and lift. This adjustment helps to avoid overfitting, i.e., the discovery of spurious highly predictive rules.
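To make the notion of k-optimal rules with a user-selected objective concrete, the following sketch enumerates all antecedents up to a small length and returns the k best rules for the chosen measure. It is an exhaustive enumeration and not the OPUS algorithm: it has none of the pruning, the m-estimate adjustment, or the statistical tests of OPUS, and all names in it are hypothetical.

```python
from itertools import combinations

def top_k_rules(rows, target, k, measure="lift", max_len=2):
    """Return the k best rules (antecedent -> target) for the given measure.

    rows: list of (features, label) pairs with discretised feature values.
    """
    n = len(rows)
    n_q = sum(1 for _, label in rows if label == target)
    if n_q == 0:
        raise ValueError("target class does not occur in rows")
    items = sorted({(f, v) for feats, _ in rows for f, v in feats.items()})
    scored = []
    for size in range(1, max_len + 1):
        for cond in combinations(items, size):
            if len({f for f, _ in cond}) < size:
                continue  # skip antecedents that test one feature twice
            hit = [label for feats, label in rows
                   if all(feats.get(f) == v for f, v in cond)]
            if not hit:
                continue
            n_p = len(hit)
            n_pq = sum(1 for label in hit if label == target)
            stats = {
                "coverage": n_p / n,
                "confidence": n_pq / n_p,
                "lift": (n_pq / n_p) / (n_q / n),
                "leverage": n_pq / n - (n_p / n) * (n_q / n),
            }
            scored.append((stats[measure], cond, stats))
    scored.sort(key=lambda t: t[0], reverse=True)  # stable: ties keep order
    return scored[:k]

# Hypothetical toy neighbourhood labelled by a black-box model.
rows = ([({"smoker": "yes", "age": "old"}, "risk")] * 3
        + [({"smoker": "yes", "age": "young"}, "ok")]
        + [({"smoker": "no", "age": "old"}, "risk")]
        + [({"smoker": "no", "age": "young"}, "ok")] * 5)
best_by_lift = top_k_rules(rows, "risk", k=1, measure="lift")
best_by_coverage = top_k_rules(rows, "risk", k=1, measure="coverage")
```

Changing the `measure` argument changes which rules are returned, which mirrors the idea that the optimisation criterion is a user-facing parameter. The exhaustive search is only feasible for tiny examples.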

We classify the set of rules obtained from the OPUS algorithm into four categories with regards to a contingency table of the LHS and RHS of the association rules. The LHS of the rule is true if the antecedent of the rule agrees with the feature values of the instance to be explained. The RHS of the rule is true if the consequent of the rule agrees with the prediction provided by the global model of the instance to be explained.

Let us consider an example in a cancer risk prediction scenario, where for a patient $x$ the global model predicts this patient as a positive case ( $y=1$ ), i.e., the patient is at risk of developing a cancer. To explain this prediction LoRMIkA produces four types of rules as follows (also see Figure 3).

1. Current Supporting Rules ( $\Re_{c\alpha}^{+}$ ): The rules that support the prediction of the global model.

Definition: if LHS = true, then RHS = true.

Interpretation: Conditions of $x$ that support and explain why the patient is predicted as a risk case ( $y=1$ ) by the global model.

Benefits: Observing the current supporting rules, the doctors can identify the conditions that lead the global model to classify this patient as at risk. These rules indicate the features that this patient has that lead to a positive classification.

Approaches: Anchor, LORE and LoRMIkA provide current supporting rules.

2. Current Contradicting Rules ( $\Re_{c\beta}^{-}$ ): The rules that contradict the prediction of the global model.

Definition: if LHS = true, then RHS = false.

Interpretation: Conditions of $x$ that contradict the current prediction ( $y=1$ ) and would support its prediction as being negative ( $y=0$ ).

Benefits: The current contradicting rules show doctors the features that the current patient has that would lead to a negative prediction, i.e., that contradict the current prediction. In our example, these factors contribute to a prediction of not being at risk of cancer, and are therefore factors that the doctors should try not to change.

Approaches: Only LoRMIkA provides current contradicting rules.

3. Hypothetically Supporting Rules ( $\Re^{+}_{h\alpha}$ ): The rules that would increase the probability of the prediction of the global model.

Definition: if LHS = false, then RHS = true.

Interpretation: The conditions that are currently not satisfied by $x$ (they are hypothetical) that would further support the current prediction of $x$ being a risk case.

Benefits: If features of the patient changed towards the conditions in the rules, this would further increase the predicted risk for $x$ . Thus, changes along the lines of these rules should be avoided in our example.

Approaches: Only LoRMIkA provides hypothetically supporting rules.

4. Hypothetically Contradicting Rules (Counterfactual Rules, $\Re^{-}_{h\beta}$ ): The rules that may invert the prediction of the global model.

Definition: if LHS = false, then RHS = false.

Interpretation: The conditions that are currently not satisfied by the features of $x$ (they are hypothetical) that would contradict the current prediction of $x$ as positive, i.e., they would contribute towards inverting the prediction to being negative ( $y=0$ , not being at risk).

Benefits: The counterfactual rules can help doctors to understand in which ways the characteristics of the patient would need to change to make the patient more likely to be classified as a negative case.

Approaches: LORE and LoRMIkA provide hypothetically contradicting rules.

With these four kinds of rules, our approach is able to provide a complete picture of rules that explain the local neighbourhood of an instance and give the practitioner all the information for informed decision making.
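The contingency table behind the four rule types can be written down directly. In the following sketch, the function name `categorise`, the feature names, and the bin labels are hypothetical; the example rules are invented for illustration and are not the rules reported in the experiments.

```python
def categorise(antecedent, rule_class, instance, predicted_class):
    """Assign a rule to one of the four categories.

    LHS is true if the instance satisfies every condition of the antecedent;
    RHS is true if the rule's class equals the global model's prediction.
    """
    lhs = all(instance.get(f) == v for f, v in antecedent.items())
    rhs = rule_class == predicted_class
    if lhs and rhs:
        return "current supporting"
    if lhs and not rhs:
        return "current contradicting"
    if not lhs and rhs:
        return "hypothetically supporting"
    return "hypothetically contradicting"

# Hypothetical discretised instance, predicted as "High" by the global model.
instance = {"age": "26-29", "is_violent_recid": "1", "priors_count": "<=4"}
predicted = "High"

rules = [
    ({"age": "26-29", "is_violent_recid": "1"}, "High"),
    ({"priors_count": "<=4"}, "Medium_Low"),
    ({"priors_count": ">4"}, "High"),
    ({"age": ">29"}, "Medium_Low"),
]
categories = [categorise(a, c, instance, predicted) for a, c in rules]
```

Each rule mined from the neighbourhood falls into exactly one of the four categories, so the categorisation is a pure post-processing step on the output of the rule miner.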

A Python implementation of LoRMIkA is available from GitHub (https://github.com/DiliSR/LoRMIkA). It can be used together with an implementation of the OPUS class association rule mining algorithm, such as the one from BigML [9].

4 Experiments

In this section, we compare LoRMIkA with other state-of-the-art rule-based algorithms in both qualitative and quantitative aspects.

4.1 Overview of the datasets

We use the evaluation framework from Guidotti et al. [13] for our experiments, which employs three real-world classification datasets: Adult [36], COMPAS [16], and German [15]. Furthermore, we use the Covertype [10] dataset to evaluate our method on larger datasets. The datasets have both categorical and continuous features. Continuous features are discretized in our approach. In the Adult, COMPAS, and German datasets, each instance of a dataset represents a record that belongs to an individual. In the Covertype dataset, each instance represents a $30m\times 30m$ patch of forest that is classified as one of seven cover types.

The Adult dataset from the UCI Machine Learning Repository contains data about income levels in relation to demographic features. It contains 48,842 instances in total. The income, which is the target column of the dataset, divides the whole dataset into two classes: “ $\leq 50K$ ” and “ $>50K$ ”.

The COMPAS dataset from ProPublica includes the features used by the COMPAS algorithm to assess a criminal defendant’s likelihood to re-offend (Low, Medium and High). This dataset consists of over 10,000 instances and we have divided it into the two classes “Medium-Low” and “High” risk.

The German dataset from the UCI Machine Learning Repository contains 1,000 instances and classifies persons as “good” or “bad” creditors.

The Covertype dataset from the UCI Machine Learning Repository contains 581,012 instances. It is a multi-class classification dataset with seven classes. As multi-class classification is not the main focus of our paper, we convert it into a binary classification problem as follows. We consider the class which has the highest number of instances to be one class (“type = 2”) and merge all the other classes into the second class (“type = 1”), making it a one-vs-all classification.

4.2 Generate global model predictions

To generate predictions from global models, we perform the following steps. First, we select a set of machine learning models as global models to make predictions in our experiments. For the experiments, we consider five algorithms as the global black-box models. They are support vector machine (SVM), random forest (RF), logistic regression (LR), decision trees (DT) and multi-layer neural networks with ‘LBFGS’ solver (NN). We use these models in their implementations from the scikit-learn library [29], with their default parameters. The only parameter that we change from the default parameters is in the RF algorithm, where we set the number of trees to 100, to keep the hyper-parameters of the global models of our experimental setup consistent with the experimental setup of LORE [13]. Some of these models, for instance DT, can be considered interpretable globally. However, we argue that it is still worthwhile to generate local explanations, as they are targeted at the particular instance and can therefore be more relevant. We do not consider the SVM model for the Covertype dataset in our experiments, as it is computationally too expensive and we did not obtain results after 10 days of running time.

After preprocessing the datasets we impute missing values in both continuous and categorical variables of the dataset using mean and mode imputation, respectively. Moreover, we discretise continuous attributes into three sub-ranges, where the size of the sub-ranges is chosen for them to contain an approximately equal number of instances. Finally, to select the training set and testing set each dataset is randomly split using a 4:1 ratio.
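The three data preparation steps (imputation, equal-frequency discretisation into three sub-ranges, and a 4:1 split) can be sketched with the standard library alone. The helper names are hypothetical, and the rank-based binning is a simplification: tied values may end up in different bins, which a production implementation would handle with explicit cut points.

```python
import random
from statistics import mean, mode

def impute(values, categorical=False):
    """Replace None by the mode (categorical) or the mean (continuous)."""
    present = [v for v in values if v is not None]
    fill = mode(present) if categorical else mean(present)
    return [fill if v is None else v for v in values]

def equal_frequency_bins(values, n_bins=3):
    """Assign each value to one of n_bins bins of (almost) equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(values), n_bins - 1)
    return bins

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Random 4:1 split into training and test rows (for test_fraction=0.2)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

ages = impute([1.0, None, 3.0])
colours = impute(["a", None, "a", "b"], categorical=True)
bins = equal_frequency_bins([5, 1, 9, 3, 7, 11])
train, test = train_test_split(list(range(10)))
```

Here the six values are distributed over three bins with two values each, and the ten rows are split into eight training rows and two test rows.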

4.3 Select rule-based model agnostic explainers for baseline comparison

For the baseline comparison, we use Anchor [35] and LORE [13] as two state-of-the-art local rule-based model-agnostic explainers.

Anchor

The main goal of Anchor is to provide if-then rules, which are called anchors. It provides rules with high confidence, where it guarantees that changing the values of the features which are not included in the rule will not affect the final prediction of the global model. Also, Anchor selects the rules with a minimum confidence of 95%. If multiple rules are generated with the same confidence, then it selects the rule with the highest coverage from those. As the output, Anchor provides a single current supporting rule which explains the prediction of the global model (i.e., $\Re_{c\alpha}^{+}$ ).

LORE

First, LORE generates new instances in the neighbourhood of the instance to be explained using a genetic algorithm. To learn the behaviour and the properties of the global model, it obtains the global model predictions of the generated instances. Finally, LORE obtains rule-based explanations after building a decision tree on the generated instances. As the output, LORE provides a single current supporting rule which explains the prediction of the global model (i.e., $\Re_{c\alpha}^{+}$ ) and rules which potentially help to invert the global model predictions which are called counterfactual rules (i.e., $\Re_{h\beta}^{-}$ ).

4.4 Parameter setup in LoRMIkA

In preliminary experiments, we determined that the parameters of our method are fairly robust, and consequently we use a common set of parameters across all datasets as follows. From Algorithm 2, $L=40$ , $w=0.75\times\sqrt{\text{no. of features}}$ . We choose the target proportion of minority class instances to majority class instances as 1:5. All runs are executed 10 times and averages are reported here, for the stability of the results.

4.5 Qualitative analysis

In this section, we illustrate with an example the performance of Anchor, LORE, and LoRMIkA. We consider a prediction for an instance of the COMPAS dataset. We use the random forest classifier, which assigns the instance under consideration a 65% probability of the prisoner having a “High” likelihood to re-offend. The feature values of the instance are as follows:

[TABLE]

In the following, we discuss the possibilities of releasing the prisoner and its implications, using the local explainer models.

What are the current conditions that support the prediction as having a High chance to re-offend?

This question can be answered using the current supporting rules ( $\Re^{+}_{c\alpha}$ ). The current supporting rules of LoRMIkA are as follows.

[TABLE]

The instance is being predicted as a prisoner with “High” likelihood to re-offend since the age is greater than or equal to 26 and less than or equal to 29.25, and this prisoner is a violent recidivist. Anchor and LORE produce the following rules to explain the same instance:

[TABLE]

LoRMIkA’s current supporting rule shares some similarity with Anchor and LORE. We find that ( $\Re^{+}_{{c\alpha\_}{{\text{LoRMIkA}}}}$ ) indicates that the prisoner has a high likelihood to re-offend if $26\leq age\leq 29.250$ . The supporting rules of LORE ( $\Re^{+}_{{c\alpha\_}{{\text{LORE}}}}$ ) indicate $age\leq 27$ , and Anchor ( $\Re^{+}_{{c\alpha\_}{{\text{Anchor}}}}$ ) indicates $25<age\leq 31$ .

However, we argue that LoRMIkA’s current supporting rule in this example is easier to understand and more informative than the rules provided by Anchor and LORE. Following Nauck [26], rules with fewer conditions can be considered more interpretable than rules with more conditions. LORE ( $\Re^{+}_{{c\alpha\_}{{\text{LORE}}}}$ ) produces a single condition with the “age” feature. Though we can consider this as highly interpretable, it does not seem sufficient to decide whether the prisoner is a “High” level risk case to re-offend. The current supporting rule of LoRMIkA ( $\Re^{+}_{{c\alpha\_}{{\text{LoRMIkA}}}}$ ) with two conditions is also brief and intuitive to understand, and it can be considered highly relevant to the prediction in this question, as the other condition of $\Re^{+}_{{c\alpha\_}{{\text{LoRMIkA}}}}$ is $is\_violent\_recid=1$ . Anchor ( $\Re^{+}_{{c\alpha\_}{{\text{Anchor}}}}$ ) on the other hand provides a rather verbose rule with 11 conditions, which includes the conditions from the LoRMIkA rule, but points narrowly to the outcome of the particular decision, and seems harder to interpret due to its complexity.

What are the current conditions that contradict the prediction as having a High chance to re-offend?

As an answer to this question, we use the current contradicting rules ( $\Re^{-}_{c\beta}$ ). Below, we present the generated current contradicting rule of LoRMIkA. To the best of our knowledge, LoRMIkA is the only algorithm that provides current contradicting rules for the prediction.

[TABLE]

The rule can be used by decision-makers as a suggestion of which features should maintain their values to achieve a prediction of Medium_Low likelihood to re-offend in the future. In our example, the rule suggests that if this prisoner re-offends at least one more time, he will lose the only protective factor which potentially supports a prediction of low likelihood to re-offend. The actual characteristic of this prisoner is { $priors\_count=4$ }, so this can be interpreted in a way that this prisoner is at a tipping point of being a regular and frequent offender.

We note that the rule has a high lift (2.31), but fairly low confidence (0.546), which indicates that this rule is not highly predictive, but can be seen as highly interesting. Anchor and LORE do not consider $priors\_count$ of the prisoner for their current supporting rules, so this information is at best implicit in those algorithms, whereas our algorithm is able to make it explicit.

What are the hypothetical conditions that support the prediction as having a High chance to re-offend?

Below, we present the generated hypothetical supporting rule of LoRMIkA ( $\Re^{+}_{h\alpha}$ ). To the best of our knowledge, LoRMIkA is the only algorithm that provides hypothetical supporting rules for a prediction.

[TABLE]

The rule is in line with the current contradicting rule, emphasizing that a further increase of $priors\_count$ would very negatively affect predicted risks for this prisoner in particular.

What are the hypothetical conditions that could potentially invert the prediction as having a high chance to re-offend to medium_low chance to re-offend?

As a solution for this question, we use the hypothetical contradicting rules ( $\Re^{-}_{h\beta}$ ), which are often also called counterfactuals. Below, we present the generated hypothetically contradicting rules of LoRMIkA and LORE.

[TABLE]

If this prisoner were older than 29 and $priors\_count$ were less than 4, this person would be identified as a prisoner with “Medium_Low” likelihood to re-offend by the algorithm, according to LoRMIkA. The actual characteristics of this prisoner are { $age=27$ , $priors\_count=4$ }. Though $priors\_count$ cannot be reduced in practice, and the rule is therefore not actionable, together with the other rules from above the explanations suggest a prisoner at a tipping point in terms of age and prior offences. The explanations suggest that if decision-makers can ensure that this prisoner does not re-offend in the next 2 years (approx.), the predicted risk will be lower afterwards. LoRMIkA’s hypothetically contradicting rule shares some similarity with that of LORE in terms of $age$ .
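To make the comparison between the hypothetical contradicting rule and the actual instance concrete, the following minimal sketch lists which conditions of a rule an instance currently fails. The thresholds are read from the prose above ("older than 29", "$priors\_count$ less than 4"), and the rule encoding as `(feature, operator, value)` triples and the helper name `unmet_conditions` are illustrative assumptions, not part of LoRMIkA.

```python
import operator
from typing import Dict, List, Tuple

# A condition is (feature, comparison operator, threshold); this encoding is an assumption.
Condition = Tuple[str, str, float]

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def unmet_conditions(instance: Dict[str, float], rule: List[Condition]) -> List[Condition]:
    """Return the conditions of `rule` that `instance` does not satisfy."""
    return [(f, op, v) for (f, op, v) in rule if not OPS[op](instance[f], v)]


# Hypothetical contradicting rule as described in the text (consequent: Medium_Low).
counterfactual_rule = [("age", ">", 29), ("priors_count", "<", 4)]

# Actual characteristics of the prisoner from the example.
prisoner = {"age": 27, "priors_count": 4}

unmet = unmet_conditions(prisoner, counterfactual_rule)
for feature, op, value in unmet:
    print(f"{feature} = {prisoner[feature]} does not satisfy {feature} {op} {value}")
```

For the prisoner in the example both conditions are unmet, each by a small margin, which is what the text describes as a tipping point.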

4.6 Quantitative analysis

In this section we present a quantitative analysis of the efficiency, stability, and interpretability of the rules produced by the different methods.

4.6.1 Analysis of the efficiency of the rules produced by LoRMIkA compared with the rules of the other state-of-the-art algorithms

Following the key assumption of the local interpretability theory [34, 35, 13], we investigate whether the rule-based explanation produced for a particular instance is homogeneous across its neighbourhood using the association rule mining measures (i.e., coverage, confidence, and lift).

We assess the accuracy and the efficiency of the local rule-based model-agnostic explainers presented here using 1) the coverage, which measures the fraction of instances in the neighbourhood that satisfy the LHS of the rule, 2) the confidence (i.e., precision, strength), which measures the percentage of instances that satisfy the RHS of the rule out of the instances covered by the LHS, and 3) the lift, which measures the interestingness of the rule.
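The three measures can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard definition of lift as confidence divided by the support of the consequent; the function name `rule_measures`, the dictionary-based instance encoding, and the toy neighbourhood are assumptions made for this example and are not taken from the LoRMIkA implementation.

```python
from typing import Callable, Dict, Sequence

Instance = Dict[str, float]


def rule_measures(
    neighbourhood: Sequence[Instance],
    predictions: Sequence[str],
    antecedent: Callable[[Instance], bool],
    consequent: str,
) -> Dict[str, float]:
    """Coverage, confidence and lift of the rule `antecedent -> consequent`."""
    n = len(neighbourhood)
    covered = [antecedent(x) for x in neighbourhood]
    n_lhs = sum(covered)  # instances satisfying the LHS
    n_rhs = sum(1 for y in predictions if y == consequent)  # instances with the RHS class
    n_both = sum(1 for c, y in zip(covered, predictions) if c and y == consequent)

    coverage = n_lhs / n
    confidence = n_both / n_lhs if n_lhs else 0.0
    support_rhs = n_rhs / n
    # Assumed standard form: lift = confidence / support of the consequent.
    lift = confidence / support_rhs if support_rhs else 0.0
    return {"coverage": coverage, "confidence": confidence, "lift": lift}


# Toy neighbourhood (made-up values) with the global model's predictions.
neighbourhood = [
    {"priors_count": 5, "is_violent_recid": 1},
    {"priors_count": 6, "is_violent_recid": 1},
    {"priors_count": 4, "is_violent_recid": 1},
    {"priors_count": 2, "is_violent_recid": 1},
    {"priors_count": 1, "is_violent_recid": 0},
    {"priors_count": 0, "is_violent_recid": 0},
    {"priors_count": 3, "is_violent_recid": 0},
    {"priors_count": 0, "is_violent_recid": 0},
]
predictions = ["High", "High", "High", "Medium_Low",
               "Medium_Low", "Medium_Low", "High", "Medium_Low"]

measures = rule_measures(
    neighbourhood,
    predictions,
    antecedent=lambda x: x["is_violent_recid"] == 1,
    consequent="High",
)
print(measures)
```

In this toy neighbourhood the rule covers half of the instances, three of the four covered instances are predicted "High", and half of all instances are predicted "High", so the rule is both reasonably predictive and more frequent among covered instances than in the neighbourhood overall.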

Moreover, we compute the average time taken to generate a single explanation for a prediction produced by each global machine learning model.

Coverage

Table 1 shows the mean coverage of the rules generated as explanations by LoRMIkA, Anchor, and LORE for the COMPAS, Adult, German and Covertype datasets. The best values across the local explainers are indicated in boldface. When comparing the mean coverage of the rules over all the global models, LoRMIkA outperforms LORE and Anchor across all datasets and models except in one case, namely the SVM model on the German dataset, where LORE performs better. We attribute LoRMIkA’s strong coverage to its flexibility in searching for and obtaining k-optimal rules with high coverage. As coverage measures how representative a rule is for a given dataset, our results show that LoRMIkA achieves the most representative rules.

Confidence

Table 2 shows the mean confidence of the rules generated as explanations by LoRMIkA, Anchor and LORE for the COMPAS, Adult, German and Covertype datasets. We see that LoRMIkA outperforms LORE and Anchor in most instances. In particular, it outperforms the comparison methods in all instances but the DT instances across all datasets, the NN in the Covertype dataset and the RF instance in the German dataset. Our results show that LORE and Anchor provide rule-based explanations with competitive mean confidence values, since their main focus is to find the rules with the highest confidence. However, LoRMIkA is able to achieve the highest mean confidence in most cases.

Lift

Table 3 shows the rate of interestingness which is calculated from the lift as an absolute difference from $1$ (as a lift of $1$ shows there is no association between antecedent and consequent).

We note that the lift for Anchor is identical to its confidence, as Anchor defines the neighbourhood in such a way that it only contains positive instances. This way, the global model predictions of the instances in the neighbourhood are equal to the global model prediction of the instance to be explained. Consequently, the $Support(q)$ in the definition of lift (Definition 5) is equal to $1$ , and the lift of an Anchor rule is equal to its confidence according to Definition 4. Thus, the lift for Anchor is by definition between $0$ and $1$ , as is the rate of interestingness.
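As a worked restatement of this argument, assume that Definition 5 takes the standard form of lift as confidence divided by the support of the consequent (the definition itself is not reproduced in this section). For a rule $p \rightarrow q$ in an Anchor neighbourhood, where $Support(q)=1$ , we then obtain

$$\text{lift}(p\rightarrow q)=\frac{\text{confidence}(p\rightarrow q)}{Support(q)}=\text{confidence}(p\rightarrow q)\in[0,1],\qquad |\text{lift}(p\rightarrow q)-1|=1-\text{confidence}(p\rightarrow q).$$

Under the same reading, the current contradicting rule of LoRMIkA discussed in Section 4.5, with a lift of 2.31, has a rate of interestingness of $|2.31-1|=1.31$ , a value that no Anchor rule can reach.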

We can see from the table that LoRMIkA outperforms LORE and Anchor in all instances but one, which is the SVM in the German dataset for LORE. This again shows the flexibility of our algorithm LoRMIkA that makes it possible to obtain interesting rules with high lift and does not limit us to rules with the highest predictive power.

Computational Time

The experiments are run on an Intel(R) i7 processor (3.2 GHz), with a single thread per core, and 64GB of main memory. Table 4 shows the average running time over 50 instances measured in seconds, to produce explanations of the predictions generated by each global model for the Covertype dataset. From the table, we can see that Anchor runs fastest as an explainer for the LR and RF global models, whereas LORE runs the fastest for DT and NN. LoRMIkA has a very consistent running time around 38s across all the models.

Overall, our analysis shows that in most cases the efficiency of the rule-based explanations of LoRMIkA is superior to the LORE and Anchor baselines with respect to coverage, confidence, and rate of interestingness of the rules when explaining the predictions of the global models of the COMPAS, Adult, German and Covertype datasets. With regard to computational time, Anchor and LORE can be both considerably faster and considerably slower than LoRMIkA, whereas LoRMIkA has a nearly constant computation time.

4.6.2 Analysis of the stability of the rules produced by LoRMIkA compared with the rules of the other state-of-the-art algorithms

We evaluate the stability of the local rule-based model-agnostic explainers by calculating the similarity of the resulting rules over independent runs, using the Jaccard coefficient as defined in Equation 11.

$$J(X,Y)=\frac{|X\cap Y|}{|X\cup Y|}\qquad(11)$$

Here, $X$ and $Y$ are two given sets of features that are included in the rules from two runs. The Jaccard coefficient calculates the similarity [20] by comparing the common and distinct features in the two sets. The output ranges from 0 to 1, and the higher the coefficient the higher the similarity of rules over the two runs. In particular, we randomly select 50 instances and evaluate the stability of the generated explanations over 10 independent runs.
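The stability measure can be sketched as follows. The Jaccard coefficient itself is standard; how the coefficient is aggregated over the 10 runs is not spelled out here, so the mean over all pairs of runs used in `mean_pairwise_jaccard` is an assumption, as are the function names and the toy feature sets.

```python
from itertools import combinations
from statistics import mean
from typing import List, Set


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Jaccard coefficient |X & Y| / |X | Y| of two feature sets."""
    union = x | y
    if not union:
        # Assumption: two empty feature sets are treated as identical.
        return 1.0
    return len(x & y) / len(union)


def mean_pairwise_jaccard(runs: List[Set[str]]) -> float:
    """Mean Jaccard coefficient over all pairs of runs (assumed aggregation)."""
    return mean(jaccard(a, b) for a, b in combinations(runs, 2))


# Feature sets of the rules returned for one instance in three runs (made-up values).
runs = [
    {"age", "priors_count"},
    {"age", "priors_count"},
    {"age", "is_violent_recid"},
]
stability = mean_pairwise_jaccard(runs)
print(round(stability, 3))
```

An explainer that returns the same rule features in every run obtains a coefficient of 1 for every pair, and therefore a mean of 1 with zero standard deviation.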

Table 5 reports the mean and the standard deviation of the Jaccard coefficient. When measuring the Jaccard coefficient, we consider rules generated for highest confidence and rules which support the current prediction (i.e., $\Re^{+}_{c\alpha}$ ). LoRMIkA produces the rules with the highest stability compared with LORE and Anchor, with a Jaccard coefficient of 1.00 on all occasions, indicating the high stability of the method. We argue that this high stability is due to the robust nature of our approach when selecting the neighbourhood, as explained in Section 3.

4.6.3 Quantitative analysis of interpretability

As the interpretability of a given method is inherently difficult to measure, we use the simplicity of the generated explanations as a proxy [4]. We measure the total number of features used in the antecedent of the given rule of the local rule-based explainers.

Table 6 reports the mean number of features used in a single rule to explain each instance using Anchor, LORE, and LoRMIkA. LoRMIkA outperforms LORE and Anchor across all datasets and models except in five cases: the LR and NN models on the COMPAS dataset, the NN model on the German dataset, and the LR and RF models on the Covertype dataset. LORE performs better in all of these cases except for LR on the Covertype dataset. This indicates that LoRMIkA is able to obtain more concise rules than Anchor and rules of comparable complexity to LORE.
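The simplicity proxy amounts to counting the distinct features in each rule's antecedent and averaging over the explained instances. A minimal sketch follows, assuming rules are encoded as lists of `(feature, operator, value)` conditions; the encoding, the function names, and the example rules are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

# A condition is (feature, comparison operator, threshold); this encoding is an assumption.
Condition = Tuple[str, str, float]


def n_features(antecedent: List[Condition]) -> int:
    """Number of distinct features used in a rule's antecedent."""
    return len({feature for feature, _, _ in antecedent})


def mean_rule_size(rules: List[List[Condition]]) -> float:
    """Mean number of antecedent features over the rules of all explained instances."""
    return mean(n_features(rule) for rule in rules)


# Hypothetical rules produced for three explained instances.
rules = [
    [("age", "<=", 29), ("is_violent_recid", "==", 1)],
    [("priors_count", ">", 4)],
    [("age", ">", 20), ("age", "<=", 29), ("priors_count", ">", 4)],
]
print(mean_rule_size(rules))
```

Counting distinct features rather than conditions means that an interval on one feature, such as the two `age` conditions in the third rule, contributes only once.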

5 Conclusions

We have presented Local Rule-based Model Interpretability with k-optimal Associations (LoRMIkA), a model-agnostic framework for local explainability of classification algorithms. It employs association rule mining techniques in a local neighbourhood to explain a particular instance and is thereby able to extract models that are local with respect to both features and instances. It uses the efficient OPUS search algorithm to extract the top k-optimal rules with respect to measures such as support, coverage, confidence, lift, and leverage. This makes the framework flexible and allows us to extract simple rules that are not only predictive but also interesting and better suited to provide explanations. The experiments performed have shown that LoRMIkA is competitive with, and able to outperform, state-of-the-art local explainers in both quantitative and qualitative aspects. Moreover, in contrast to state-of-the-art local explainers, it is able to provide multiple rules which explain the prediction in various aspects.

Acknowledgements

This research was supported by the Australian Research Council under grant DE190100045.

References

Agrawal et al. [1993]

Rakesh Agrawal, Tomasz Imieliński, and Arun Swami.

Mining association rules between sets of items in large databases.

SIGMOD Rec., 22(2):207–216, June 1993.

Arik and Pfister [2019]

Sercan Ömer Arik and Tomas Pfister.

Attention-based prototypical learning towards interpretable, confident and robust deep neural networks.

CoRR, abs/1902.06292, 2019.

URL http://arxiv.org/abs/1902.06292.

Azmi et al. [2019]

Mohamed Azmi, George Runger, and Abdelaziz Berrado.

Interpretable regularized class association rules algorithm for classification in a categorical data space.

Information sciences, 483:313–331, May 2019.

ISSN 0020-0255.

doi: 10.1016/j.ins.2019.01.047.

URL https://asu.pure.elsevier.com/en/publications/interpretable-regularized-class-association-rules-algorithm-for-c.

Bechlivanidis et al. [2017]

Christos Bechlivanidis, David A Lagnado, Jeffrey C Zemla, and Steven Sloman.

Concreteness and abstraction in everyday explanation.

Psychonomic bulletin & review, 24(5):1451–1464, October 2017.

Kim et al. [2018]

Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, and Rory Sayres.

Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV).

In Proceedings of the 35th International Conference on Machine Learning, 35, 2018.

Cano et al. [2013]

Alberto Cano, Amelia Zafra, and Sebastián Ventura.

An interpretable classification rule mining algorithm.

Information Sciences, 240:1–20, August 2013.

Cohen et al. [2001]

E Cohen, M Datar, S Fujiwara, A Gionis, P Indyk, R Motwani, J D Ullman, and C Yang.

Finding interesting associations without support pruning.

IEEE Trans. Knowl. Data Eng., 13(1):64–78, January 2001.

Cohen [1995]

William W Cohen.

Fast effective rule induction.

In Proceedings of the Twelfth International Conference on International Conference on Machine Learning, ICML’95, pages 115–123, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc.

Donaldson [2019]

Justin Donaldson.

BigML, 2019.

URL https://www.bigml.com.

Dua and Graff [2017]

Dheeru Dua and Casey Graff.

UCI machine learning repository, 2017.

URL http://archive.ics.uci.edu/ml.

Džeroski et al. [1993]

Sašo Džeroski, Bojan Cestnik, and Igor Petrovski.

Using the m-estimate in rule induction.

Journal of computing and information technology, 1(1):37–46, March 1993.

Geng and Hamilton [2006]

Liqiang Geng and Howard J Hamilton.

Interestingness measures for data mining: A survey.

ACM Computing Surveys (CSUR), 38(3):9, September 2006.

Guidotti et al. [2018a]

Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Dino Pedreschi, Franco Turini, and Fosca Giannotti.

Local rule-based explanations of black box decision systems.

CoRR, abs/1805.10820, 2018a.

URL http://arxiv.org/abs/1805.10820.

Guidotti et al. [2018b]

Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi.

A survey of methods for explaining black box models.

ACM Comput. Surv., 51(5):93:1–93:42, August 2018b.

Hofmann [1994]

Hans Hofmann.

UCI machine learning repository: Statlog (german credit data) data set, November 1994.

URL https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data).

Larson et al. [2017]

Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin.

compas-analysis, June 2017.

URL https://github.com/propublica/compas-analysis.

Kaufmann and Kalyanakrishnan [2013]

Emilie Kaufmann and Shivaram Kalyanakrishnan.

Information complexity in bandit subset selection.

In Conference on Learning Theory, pages 228–251, June 2013.

Khanna et al. [2018]

Rajiv Khanna, Been Kim, Joydeep Ghosh, and Oluwasanmi Koyejo.

Interpreting black box predictions using fisher kernels.

October 2018.

URL http://arxiv.org/abs/1810.10118.

Koh and Liang [2017]

Pang Wei Koh and Percy Liang.

Understanding black-box predictions via influence functions.

In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, pages 1885–1894. JMLR.org, August 2017.

Kuznetsov and Makhalova [2018]

S O Kuznetsov and T Makhalova.

On interestingness measures of formal concepts.

Information Sciences, 442-443:202–219, May 2018.

Lakkaraju et al. [2016]

Himabindu Lakkaraju, Stephen H Bach, and Jure Leskovec.

Interpretable decision sets: A joint framework for description and prediction.

KDD, 2016:1675–1684, August 2016.

Lakkaraju et al. [2019]

Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec.

Faithful and customizable explanations of black box models.

In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019.

Lipton [2018]

Zachary C Lipton.

The mythos of model interpretability.

Queueing Systems. Theory and Applications, June 2018.

Lundberg and Lee [2017]

Scott M Lundberg and Su-In Lee.

A unified approach to interpreting model predictions.

In I Guyon, U V Luxburg, S Bengio, H Wallach, R Fergus, S Vishwanathan, and R Garnett, editors, Advances in Neural Information Processing Systems 30, pages 4765–4774. Curran Associates, Inc., 2017.

Martín et al. [2016]

D Martín, J Alcalá-Fdez, A Rosete, and F Herrera.

NICGAR: A niching genetic algorithm to mine a diverse set of interesting quantitative association rules.

Information Sciences, 355-356:208–228, August 2016.

Nauck [2002]

Detlef D Nauck.

Measuring interpretability in Rule-Based classification systems.

In Proc. of IEEE International Conference on Fuzzy Systems, 2002.

Novak et al. [2009]

Petra Kralj Novak, Nada Lavrač, and Geoffrey I Webb.

Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining.

J. Mach. Learn. Res., 10(Feb):377–403, 2009.

Nowak-Brzezińska and Wakulicz-Deja [2019]

Agnieszka Nowak-Brzezińska and Alicja Wakulicz-Deja.

Exploration of rule-based knowledge bases: A knowledge engineer’s support.

Information Sciences, 485:301 – 318, 2019.

Pedregosa et al. [2011]

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al.

Scikit-learn: Machine learning in python.

Journal of machine learning research, 12(Oct):2825–2830, 2011.

Proença and van Leeuwen [2019]

Hugo M Proença and Matthijs van Leeuwen.

Interpretable multiclass classification by MDL-based rule lists.

Information Sciences, October 2019.

Puri et al. [2017]

Nikaash Puri, Piyush Gupta, Pratiksha Agarwal, Sukriti Verma, and Balaji Krishnamurthy.

MAGIX: model agnostic globally interpretable explanations.

CoRR, abs/1706.07160, 2017.

URL http://arxiv.org/abs/1706.07160.

Quinlan and Cameron-Jones [1993]

J R Quinlan and R M Cameron-Jones.

FOIL: A midterm report.

In Machine Learning: ECML-93, pages 1–20. Springer Berlin Heidelberg, 1993.

Quinlan [1986]

J. Ross Quinlan.

Induction of decision trees.

Machine learning, 1(1):81–106, 1986.

Ribeiro et al. [2016]

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin.

Why should I trust you?: Explaining the Predictions of Any Classifier.

In Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDDM), pages 1135–1144, 2016.

Ribeiro et al. [2018]

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin.

Anchors: High-Precision Model-Agnostic explanations.

In Thirty-Second AAAI Conference on Artificial Intelligence, April 2018.

Kohavi and Becker [1996]

Ronny Kohavi and Barry Becker.

UCI machine learning repository: Adult data set, May 1996.

URL https://archive.ics.uci.edu/ml/datasets/adult.

Storn and Price [1997]

Rainer Storn and Kenneth Price.

Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces.

Journal of Global Optimization, 11(4):341–359, December 1997.

Webb [1995]

Geoffrey I Webb.

OPUS: An efficient admissible algorithm for unordered search.

The journal of artificial intelligence research, 3(1):431–465, December 1995.

ISSN 1076-9757.

Webb [2006]

Geoffrey I Webb.

Discovering significant rules.

In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’06, pages 434–443, New York, NY, USA, 2006. ACM.

ISBN 9781595933393.

Webb [2011a]

Geoffrey I Webb.

Filtered-top-k association discovery.

WIREs Data Mining Knowl Discov, 1(3):183–192, May 2011a.

Webb [2011b]

Geoffrey I Webb.

Filtered-top-k association discovery.

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3):183–192, 2011b.

Yeh et al. [2018]

Chih-Kuan Yeh, Joon Kim, Ian En-Hsu Yen, and Pradeep K Ravikumar.

Representer point selection for explaining deep neural networks.

In S Bengio, H Wallach, H Larochelle, K Grauman, N Cesa-Bianchi, and R Garnett, editors, Advances in Neural Information Processing Systems 31, pages 9291–9301. Curran Associates, Inc., 2018.

Yin and Han [2003]

X Yin and J Han.

CPAR: Classification based on predictive association rules.

In Proceedings of the 2003 SIAM International Conference on Data Mining, Proceedings, pages 331–335. Society for Industrial and Applied Mathematics, May 2003.

Yu et al. [2018a]

J Yu, Z Kuang, B Zhang, W Zhang, D Lin, and J Fan.

Leveraging content sensitiveness and user trustworthiness to recommend Fine-Grained privacy settings for social image sharing.

IEEE Transactions on Information Forensics and Security, 13(5):1317–1332, May 2018a.

Yu et al. [2018b]

Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, and Dacheng Tao.

Beyond bilinear: Generalized multimodal factorized High-Order pooling for visual question answering.

IEEE transactions on neural networks and learning systems, 29(12):5947–5959, December 2018b.

Zhang et al. [2018]

Jian Zhang, Jun Yu, and Dacheng Tao.

Local Deep-Feature alignment for unsupervised dimension reduction.

IEEE transactions on image processing: a publication of the IEEE Signal Processing Society, 27(5):2420–2432, 2018.
