Lifting Interpretability-Performance Trade-off via Automated Feature Engineering

Alicja Gosiewska; Przemyslaw Biecek

arXiv:2002.04267·cs.LG·February 12, 2020·1 cites

TL;DR

This paper introduces a method to develop interpretable models with high accuracy by leveraging surrogate black-box models for feature engineering, validated through extensive benchmarking on tabular datasets.

Contribution

It proposes a novel approach that uses elastic black-box models to automate feature engineering, balancing interpretability and performance without manual feature crafting.

Findings

1. Extracting information from complex models can enhance linear model performance.

2. Complex models do not always outperform simpler linear models.

3. Automated feature engineering can improve interpretability without sacrificing accuracy.

Abstract

Complex black-box predictive models may perform well, but their lack of interpretability causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, achieving satisfactory accuracy with interpretable models requires more time-consuming feature engineering. Can we train interpretable and accurate models without time-consuming feature engineering? We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. The new models are trained on newly engineered features extracted with the help of a surrogate model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are two results: 1) extracting information from complex models may improve the performance of linear models, 2) questioning a…
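The idea of engineering features with the help of a surrogate model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how features are extracted, so the use of a partial-dependence profile and a single "largest jump" threshold are assumptions made here for illustration. The synthetic data, the `surrogate_predict` stand-in for a fitted black-box, and all helper names are hypothetical.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average surrogate prediction when `feature` is forced to each grid value."""
    profile = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        profile.append(predict(X_mod).mean())
    return np.array(profile)

def largest_jump_threshold(grid, profile):
    """Crude single changepoint: midpoint of the grid interval with the largest jump."""
    i = np.argmax(np.abs(np.diff(profile)))
    return 0.5 * (grid[i] + grid[i + 1])

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
# Synthetic target (assumption): a step in feature 0 plus a linear effect of feature 1.
y = 2.0 * (X[:, 0] > 0.25) + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=500)

# Hypothetical stand-in for a fitted black-box surrogate; in practice this
# would be the predict function of any flexible model trained on (X, y).
def surrogate_predict(X_new):
    return 2.0 / (1.0 + np.exp(-40.0 * (X_new[:, 0] - 0.25))) + 0.5 * X_new[:, 1]

# Read a threshold for feature 0 off the surrogate's partial-dependence profile.
grid = np.linspace(-1.0, 1.0, 21)
profile = partial_dependence(surrogate_predict, X, 0, grid)
threshold = largest_jump_threshold(grid, profile)

# Replace the raw feature with an interpretable binary indicator and refit
# a plain linear model on the engineered features.
X_engineered = np.column_stack([(X[:, 0] > threshold).astype(float), X[:, 1]])
r2_raw = r_squared(X, y)
r2_engineered = r_squared(X_engineered, y)
```

In this toy setting the linear model fitted on the engineered indicator explains more variance than the one fitted on the raw feature, while remaining a glass-box: each coefficient maps to a readable rule such as "feature 0 above the threshold". Whether a similar gain appears on real data depends on the data set and on the surrogate.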

Code & Models

Repositories

agosiewska/SAFE-experiments

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Data Stream Mining Techniques

Methods: Interpretability