Multicriteria interpretability driven Deep Learning

Marco Repetto

arXiv:2111.14088·cs.LG·November 30, 2021

TL;DR

This paper introduces a multicriteria approach that builds interpretability into deep learning models from the start. It uses knowledge injection to control feature effects, including non-linear ones, which makes the models suitable for regulated fields.

Contribution

It proposes a novel multicriteria technique that integrates interpretability constraints directly into deep learning models through knowledge injection, and extends it to non-linear effects.

Findings

01 Creates interpretable, high-performance models for credit risk.

02 Enhances robustness against data scarcity biases.

03 Aligns with recent regulatory standards.

Abstract

Deep Learning methods are renowned for their performance, yet their lack of interpretability prevents their use in high-stakes contexts. Recent model-agnostic methods address this problem post hoc by reverse-engineering the model's inner workings. However, in many regulated fields, interpretability should be kept in mind from the start, which means that post-hoc methods are valid only as a sanity check after model training. Interpretability from the start, in an abstract setting, means posing a set of soft constraints on the model's behavior by injecting knowledge and eliminating possible biases. We propose a Multicriteria technique that controls the effects of features on the model's outcome by injecting knowledge into the objective function. We then extend the technique by including a non-linear knowledge function to account for more complex…
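
The idea of injecting knowledge into the objective function can be illustrated with a minimal sketch. The abstract does not give the paper's formulation, so everything below is an assumption made for illustration: a logistic model stands in for the deep network, the knowledge term is a hinge penalty on finite-difference feature effects whose sign contradicts a prior expectation, and the two criteria are combined by a simple weighted sum. The names `knowledge_penalty`, `multicriteria_objective`, `expected_sign`, and `lam` are hypothetical.

```python
import math


def predict(weights, bias, x):
    """Logistic model; a stand-in (assumption) for the deep network."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def data_loss(weights, bias, X, y):
    """Mean binary cross-entropy: the predictive-performance criterion."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, t in zip(X, y):
        p = predict(weights, bias, x)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(X)


def knowledge_penalty(weights, bias, X, feature, expected_sign, step=1e-3):
    """Soft constraint (hypothetical form): mean hinge penalty on
    finite-difference effects of `feature` whose sign contradicts
    `expected_sign` (+1 for increasing, -1 for decreasing)."""
    total = 0.0
    for x in X:
        shifted = list(x)
        shifted[feature] += step
        effect = (predict(weights, bias, shifted) - predict(weights, bias, x)) / step
        total += max(0.0, -expected_sign * effect)
    return total / len(X)


def multicriteria_objective(weights, bias, X, y, feature, expected_sign, lam):
    """Weighted-sum scalarization (an assumption) of the two criteria."""
    fit = data_loss(weights, bias, X, y)
    knowledge = knowledge_penalty(weights, bias, X, feature, expected_sign)
    return fit + lam * knowledge
```

With `lam = 0` the objective reduces to the ordinary data loss; raising `lam` trades predictive fit for agreement with the injected knowledge. A non-linear knowledge function, as mentioned in the abstract, would replace the fixed `expected_sign` with an expected effect that varies over the feature's range.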

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning