When stakes are high: balancing accuracy and transparency with Model-Agnostic Interpretable Data-driven suRRogates
Roel Henckaerts, Katrien Antonio, Marie-Pier Côté

TL;DR
This paper introduces maidrr, a model-agnostic method for creating interpretable surrogate models for tabular data that balance transparency and accuracy, especially useful in regulated industries.
Contribution
The paper presents a novel procedure for developing interpretable surrogates using partial dependence effects and feature engineering, improving transparency without sacrificing performance.
Findings
The maidrr GLM closely approximates the black box model it is built from.
It outperforms linear and tree surrogates in accuracy.
The case study covers general insurance claim frequency modeling on six publicly available datasets.
Abstract
Highly regulated industries, like banking and insurance, ask for transparent decision-making algorithms. At the same time, competitive markets are pushing for the use of complex black box models. We therefore present a procedure to develop a Model-Agnostic Interpretable Data-driven suRRogate (maidrr) suited for structured tabular data. Knowledge is extracted from a black box via partial dependence effects. These are used to perform smart feature engineering by grouping variable values. This results in a segmentation of the feature space with automatic variable selection. A transparent generalized linear model (GLM) is fit to the features in categorical format and their relevant interactions. We demonstrate our R package maidrr with a case study on general insurance claim frequency modeling for six publicly available datasets. Our maidrr GLM closely approximates a gradient boosting…
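The pipeline described in the abstract (partial dependence effects, grouping of variable values, then a GLM on the resulting categorical features) can be illustrated with a minimal Python sketch. This is not the maidrr R package or its API. The `black_box` function, the synthetic data, the helper names, and the equal-width grouping rule are all assumptions made for illustration, since the abstract does not specify how values are grouped. The final step relies on the fact that a Poisson GLM with a log link and a single categorical feature has fitted means equal to the per-group averages of the response.

```python
import numpy as np

def black_box(X):
    # Hypothetical stand-in for a fitted black box (e.g. a boosted model):
    # predicted claim frequency from age (column 0) and power (column 1).
    age, power = X[:, 0], X[:, 1]
    return np.exp(-2.0 + 0.8 * (age < 30) + 0.4 * (age > 70) + 0.02 * power)

def partial_dependence(predict, X, feature, grid):
    # Average prediction when `feature` is forced to each grid value in turn.
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        pd_values.append(predict(X_mod).mean())
    return np.array(pd_values)

def group_by_pd(pd_values, n_groups):
    # Assumed grouping rule: cut the range of PD values into equal-width
    # intervals, so grid values with similar PD effects share a group.
    edges = np.linspace(pd_values.min(), pd_values.max(), n_groups + 1)
    return np.digitize(pd_values, edges[1:-1])

# Synthetic portfolio: integer ages 18..90 and engine power 20..119.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(18, 91, size=2000),
    rng.integers(20, 120, size=2000),
]).astype(float)

# Step 1: extract knowledge from the black box via partial dependence.
age_grid = np.arange(18, 91)
pd_age = partial_dependence(black_box, X, feature=0, grid=age_grid)

# Step 2: group age values with similar partial dependence effects.
age_groups = group_by_pd(pd_age, n_groups=3)

# Step 3: surrogate on the grouped feature. For a Poisson GLM with log link
# and one categorical feature, the coefficients are logs of the group means.
obs_group = age_groups[(X[:, 0] - 18).astype(int)]
target = black_box(X)
coef = np.log([target[obs_group == g].mean() for g in range(3)])

for g in range(3):
    ages = age_grid[age_groups == g]
    print(f"group {g}: ages {ages.min()}-{ages.max()}, log-mean {coef[g]:.3f}")
```

In this toy setup the grouping recovers the three age bands built into the hypothetical black box, and the surrogate needs only one coefficient per band. A real application would repeat the grouping for every feature, consider interactions, and fit the GLM with a statistics library.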
Taxonomy
Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Topic Modeling
