Model Interpretability through the Lens of Computational Complexity
Pablo Barceló, Mikaël Monet, Jorge Pérez, Bernardo Subercaseaux

TL;DR
This paper proposes a formal, complexity-theoretic framework for assessing model interpretability, showing linear and tree-based models are more interpretable than neural networks, with shallow networks being more interpretable than deep ones.
Contribution
It introduces a computational complexity-based definition of interpretability and applies it to compare different model classes, providing theoretical insights aligned with common beliefs.
Findings
Linear and tree-based models are strictly more interpretable than neural networks under standard complexity-theoretic assumptions (such as P ≠ NP).
Shallow neural networks are more interpretable than deeper ones, a distinction that emerges through parameterized complexity.
Linear and tree-based models are not clearly separated from each other: which of the two is easier to explain depends on the post-hoc explanation query considered.
Abstract
In spite of several claims stating that some models are more interpretable than others -- e.g., "linear models are more interpretable than deep neural networks" -- we still lack a principled notion of interpretability to formally compare among different classes of models. We make a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class C1 of models is more interpretable than another class C2, if the computational complexity of answering post-hoc queries for models in C2 is higher than for those in C1. We prove that this notion provides a good theoretical…
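To make "answering a local post-hoc query" concrete, here is a minimal sketch for one illustrative query: the fewest feature flips needed to change a Boolean classifier's output on a given input. The choice of query, the toy tree encoding, and all function names are assumptions made for this example, not taken from the text above. The sketch contrasts a generic search, which treats the model as a black box and may try exponentially many subsets, with a single traversal that exploits tree structure (assuming each feature is tested at most once per path).

```python
from itertools import combinations

# Hypothetical toy tree: an internal node is (feature_index, low_child, high_child);
# a leaf is the predicted label (0 or 1). This one predicts 1 iff x[0] == 1 or x[1] == 1.
TREE = (0, (1, 0, 1), 1)


def tree_predict(tree, x):
    """Follow the path selected by x down to a leaf label."""
    while isinstance(tree, tuple):
        feature, low, high = tree
        tree = high if x[feature] else low
    return tree


def min_change_brute_force(predict, x):
    """Fewest flips that change predict(x), treating the model as a black box.

    Tries subsets in order of increasing size, so the worst case is exponential.
    """
    original = predict(x)
    for k in range(1, len(x) + 1):
        for subset in combinations(range(len(x)), k):
            y = list(x)
            for i in subset:
                y[i] = 1 - y[i]
            if predict(y) != original:
                return k
    return None  # the classifier is constant


def min_change_tree(tree, x):
    """Same quantity for a tree, in one traversal.

    Assumes each feature is tested at most once on any root-to-leaf path.
    """
    target = 1 - tree_predict(tree, x)

    def cost(node):
        # Flips needed to steer x to some leaf labelled `target` below this node.
        if not isinstance(node, tuple):
            return 0 if node == target else float("inf")
        feature, low, high = node
        return min(cost(low) + (x[feature] != 0),
                   cost(high) + (x[feature] != 1))

    best = cost(tree)
    return None if best == float("inf") else best
```

Both functions return the same value on the toy tree; the difference lies in how the cost grows with the number of features, which is the kind of gap the complexity-based comparison of model classes is meant to capture.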
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
Methods: Interpretability
