TL;DR
This paper introduces model-agnostic complexity measures based on functional decomposition to improve the reliability and compactness of post-hoc interpretability methods for complex machine learning models.
Contribution
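It proposes three model-agnostic complexity measures based on functional decomposition (number of features used, interaction strength, and main effect complexity) and demonstrates their use in a multi-objective optimization that minimizes loss and complexity simultaneously.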
Findings
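Models that minimize the three complexity measures yield more compact post-hoc interpretations.
Post-hoc explanations are more reliable for models with low complexity, especially with respect to feature interactions.
Multi-objective optimization that minimizes loss and complexity simultaneously exposes the trade-off between model accuracy and interpretability.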
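To make the multi-objective idea concrete, the following is a minimal sketch of Pareto filtering over candidate models, where every objective is minimized. It is not the paper's optimization procedure: the function names `dominates` and `pareto_front`, the objective ordering (loss, number of features, interaction strength, main effect complexity), and all numeric values are hypothetical placeholders chosen for illustration.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates (all objectives minimized)."""
    return {
        name: obj
        for name, obj in candidates.items()
        if not any(dominates(other, obj) for other in candidates.values())
    }


# Hypothetical models with made-up objective values, NOT results from the paper:
# (loss, number of features used, interaction strength, main effect complexity)
candidates = {
    "A": (0.10, 8, 0.30, 4.0),
    "B": (0.12, 3, 0.05, 2.0),
    "C": (0.15, 4, 0.10, 2.5),  # worse than B on every objective
    "D": (0.30, 1, 0.00, 1.0),
}

front = pareto_front(candidates)
print(sorted(front))
```

In this toy example, model C is dropped because B is at least as good on every objective. The remaining models represent different trade-offs between loss and complexity, and choosing among them is left to the practitioner.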
Abstract
Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially w.r.t. feature interactions. To quantify the complexity of arbitrary machine learning models, we propose model-agnostic complexity measures based on functional decomposition: number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures is more reliable and compact. Furthermore, we demonstrate the application of these measures in a multi-objective optimization approach which simultaneously minimizes loss and complexity.
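As a rough illustration of two of the three measures, the sketch below treats the model as a black-box `predict` function. It makes simplifying assumptions that are not taken from the abstract: a feature counts as used if permuting it changes any prediction, and main effects are approximated by centered partial dependence rather than by whatever decomposition estimator the authors use. Interaction strength is then the share of prediction variance that the intercept plus main effects fail to reproduce. All function names are hypothetical, and main effect complexity is not covered.

```python
import numpy as np


def n_features_used(predict, X, rng, n_repeats=3, tol=1e-8):
    """Count features whose permutation changes at least one prediction."""
    base = predict(X)
    used = 0
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(X[:, j])
            if np.max(np.abs(predict(Xp) - base)) > tol:
                used += 1
                break
    return used


def main_effect(predict, X, j):
    """Centered partial dependence of feature j, evaluated at each observed value."""
    pd = np.empty(X.shape[0])
    for i in range(X.shape[0]):
        Xi = X.copy()
        Xi[:, j] = X[i, j]  # fix feature j, average over the other features
        pd[i] = predict(Xi).mean()
    return pd - pd.mean()


def interaction_strength(predict, X):
    """Share of prediction variance not reproduced by intercept plus main effects."""
    f = predict(X)
    f0 = f.mean()
    first_order = f0 + sum(main_effect(predict, X, j) for j in range(X.shape[1]))
    total = np.sum((f - f0) ** 2)
    if total == 0:
        return 0.0  # constant model: nothing to explain
    return float(np.sum((f - first_order) ** 2) / total)


# Toy models on synthetic data (assumed purely for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))


def additive(X):
    # Uses features 0 and 1 only, with no interaction.
    return 2 * X[:, 0] + np.sin(X[:, 1])


def interacting(X):
    # Pure product interaction between features 0 and 1.
    return X[:, 0] * X[:, 1]


print(n_features_used(additive, X, rng))
print(interaction_strength(additive, X))
print(interaction_strength(interacting, X))
```

For the additive toy model the main effects reproduce the predictions up to floating-point error, so the interaction strength is essentially zero. For the product model most of the variance is left unexplained by the main effects, which is the situation where a one-feature-at-a-time plot would be misleading.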
