When stakes are high: balancing accuracy and transparency with Model-Agnostic Interpretable Data-driven suRRogates

Roel Henckaerts; Katrien Antonio; Marie-Pier Côté

arXiv:2007.06894 · stat.ML · December 11, 2020

Open Access

TL;DR

This paper introduces maidrr, a model-agnostic method for creating interpretable surrogate models for tabular data that balance transparency and accuracy, especially useful in regulated industries.

Contribution

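The paper presents a procedure for building interpretable surrogates: partial dependence effects extracted from a black box drive feature engineering by grouping variable values, and a transparent GLM is fit to the resulting categorical features and their relevant interactions. The aim is to gain transparency while staying close to the accuracy of the black box.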

Findings

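01. The maidrr GLM closely approximates the black box model it is built from.

02. The maidrr GLM outperforms linear and tree surrogates in accuracy.

03. The procedure is demonstrated on general insurance claim frequency modeling for six publicly available datasets.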

Abstract

Highly regulated industries, like banking and insurance, ask for transparent decision-making algorithms. At the same time, competitive markets are pushing for the use of complex black box models. We therefore present a procedure to develop a Model-Agnostic Interpretable Data-driven suRRogate (maidrr) suited for structured tabular data. Knowledge is extracted from a black box via partial dependence effects. These are used to perform smart feature engineering by grouping variable values. This results in a segmentation of the feature space with automatic variable selection. A transparent generalized linear model (GLM) is fit to the features in categorical format and their relevant interactions. We demonstrate our R package maidrr with a case study on general insurance claim frequency modeling for six publicly available datasets. Our maidrr GLM closely approximates a gradient boosting…
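The first two steps described in the abstract (extracting partial dependence effects, then grouping variable values) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `black_box`, the synthetic data, the tolerance `tol`, and the greedy merging rule are hypothetical and are not the grouping algorithm of the paper or the API of the maidrr R package. The GLM fitting step is left out.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    effects = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value          # overwrite the feature for all rows
        effects.append(predict(X_mod).mean())
    return np.array(effects)

def group_by_effect(pd_values, tol):
    """Greedy stand-in for grouping (assumption, not the paper's method):
    start a new group whenever the effect moves at least `tol` away from
    the effect that opened the current group."""
    labels = np.zeros(len(pd_values), dtype=int)
    anchor = pd_values[0]
    for i in range(1, len(pd_values)):
        if abs(pd_values[i] - anchor) >= tol:
            labels[i] = labels[i - 1] + 1  # open a new group
            anchor = pd_values[i]
        else:
            labels[i] = labels[i - 1]      # stay in the current group
    return labels

# Hypothetical black box: a step effect in feature 0, no effect of feature 1.
def black_box(X):
    return np.where(X[:, 0] < 30, 0.20, 0.10) + 0.0 * X[:, 1]

rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(18, 60, size=500), rng.normal(size=500)])

age_grid = np.arange(18, 60)
age_labels = group_by_effect(partial_dependence(black_box, X, 0, age_grid), tol=0.05)

noise_grid = np.linspace(-2.0, 2.0, 9)
noise_labels = group_by_effect(partial_dependence(black_box, X, 1, noise_grid), tol=0.05)
```

In this toy setup the first feature splits into two segments at 30, while the second feature collapses into a single group. A feature with only one group carries no information for the surrogate, which is one way to read the abstract's "automatic variable selection". The group labels would then serve as categorical inputs to the GLM.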

Taxonomy

Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Topic Modeling