Multicriteria interpretability driven Deep Learning
Marco Repetto

TL;DR
This paper introduces a multicriteria approach that embeds interpretability into deep learning models from the start. It uses knowledge injection to control feature effects, including non-linear ones, which makes it suitable for regulated fields.
Contribution
It proposes a novel multicriteria technique for integrating interpretability constraints directly into deep learning models through knowledge injection, including non-linear effects.
Findings
Creates interpretable, high-performance models for credit risk.
Enhances robustness against data scarcity biases.
Aligns with recent regulatory standards.
Abstract
Deep Learning methods are renowned for their performance, yet their lack of interpretability prevents their use in high-stakes contexts. Recent model-agnostic methods address this problem by providing post-hoc interpretability, reverse-engineering the model's inner workings. However, in many regulated fields, interpretability should be kept in mind from the start, which means that post-hoc methods are valid only as a sanity check after model training. Interpretability from the start, in an abstract setting, means posing a set of soft constraints on the model's behavior by injecting knowledge and annihilating possible biases. We propose a Multicriteria technique that allows controlling the feature effects on the model's outcome by injecting knowledge into the objective function. We then extend the technique by including a non-linear knowledge function to account for more complex…
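The idea of injecting knowledge into the objective function can be illustrated with a minimal sketch. It is not the paper's formulation, which the truncated abstract does not spell out. It assumes the knowledge is an expected direction of a feature's effect (+1 increasing, -1 decreasing), measures the effect with a finite difference, and combines the data loss and the knowledge penalty as a weighted sum. The names `logistic_model`, `knowledge_penalty`, and `multicriteria_loss`, and the weights `w_data` and `w_knowledge`, are all hypothetical.

```python
import math


def logistic_model(weights, bias):
    """Toy scoring function standing in for a trained network (assumption)."""
    def predict(x):
        z = bias + sum(w * xi for w, xi in zip(weights, x))
        return 1.0 / (1.0 + math.exp(-z))
    return predict


def log_loss(predict, X, y):
    """Average binary cross-entropy: the data-fit criterion."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def knowledge_penalty(predict, X, feature, direction, step=1e-2):
    """Soft constraint: penalize finite-difference effects of `feature`
    whose sign disagrees with the expected `direction` (+1 or -1)."""
    total = 0.0
    for xi in X:
        shifted = list(xi)
        shifted[feature] += step
        effect = (predict(shifted) - predict(xi)) / step
        total += max(0.0, -direction * effect)
    return total / len(X)


def multicriteria_loss(predict, X, y, constraints, w_data=0.5, w_knowledge=0.5):
    """Weighted sum of the data criterion and the knowledge criterion.
    `constraints` is a list of (feature_index, direction) pairs."""
    penalty = sum(knowledge_penalty(predict, X, j, d) for j, d in constraints)
    return w_data * log_loss(predict, X, y) + w_knowledge * penalty


if __name__ == "__main__":
    X = [[0.0, 1.0], [1.0, 0.0], [2.0, -1.0]]
    y = [0, 1, 1]
    model = logistic_model([1.5, -0.5], 0.0)
    # Feature 0 has a positive weight, so an "increasing" constraint costs nothing,
    # while a "decreasing" constraint is penalized.
    print(multicriteria_loss(model, X, y, [(0, +1)]))
    print(multicriteria_loss(model, X, y, [(0, -1)]))
```

In this sketch the penalty is a soft constraint: a model that violates the expected direction is not rejected, it simply pays a cost that trades off against predictive fit through the two weights.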
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
