Provably Explaining Neural Additive Models
Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Volkan Şahin, Jan Kretinsky, Matthias Althoff, Guy Katz

TL;DR
This paper introduces a new efficient algorithm for generating provably minimal explanations for Neural Additive Models, significantly improving interpretability and computational efficiency over previous methods.
Contribution
The authors develop a model-specific algorithm that produces cardinally-minimal explanations for NAMs with logarithmic verification queries, outperforming existing algorithms.
Findings
Produces smaller, provably minimal explanations
Reduces computation time compared to previous methods
Outperforms existing algorithms in explanation quality
Abstract
Despite significant progress in post-hoc explanation methods for neural networks, many remain heuristic and lack provable guarantees. A key approach for obtaining explanations with provable guarantees is by identifying a cardinally-minimal subset of input features which by itself is provably sufficient to determine the prediction. However, for standard neural networks, this task is often computationally infeasible, as it demands a worst-case exponential number of verification queries in the number of input features, each of which is NP-hard. In this work, we show that for Neural Additive Models (NAMs), a recent and more interpretable neural network family, we can efficiently generate explanations with such guarantees. We present a new model-specific algorithm for NAMs that generates provably cardinally-minimal explanations using only a logarithmic number of verification queries in…
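To make the idea in the abstract concrete, here is a minimal toy sketch, not the authors' algorithm. It assumes a binary additive classifier `sign(bias + sum_i f_i(x_i))` whose per-feature shape functions `f_i` are plain Python callables on bounded intervals. The names `explain_additive`, `shape_fns`, `domains`, and `grid` are hypothetical. The worst case of each free feature is estimated here by sampling a grid, which is not sound; in the paper's setting this step is where verification queries on the univariate networks would come in. Under that assumption, features can be ranked by how much fixing them improves the worst case, and the smallest sufficient prefix of that ranking can be found by binary search.

```python
import math
from typing import Callable, List, Sequence, Tuple


def explain_additive(
    shape_fns: Sequence[Callable[[float], float]],
    x: Sequence[float],
    domains: Sequence[Tuple[float, float]],
    bias: float = 0.0,
    grid: int = 201,
) -> List[int]:
    """Indices of a smallest feature subset that fixes the predicted class.

    Assumption: the worst case of each free feature is approximated on a
    grid (illustrative only, NOT a sound verifier).
    """
    contribs = [f(v) for f, v in zip(shape_fns, x)]
    score = bias + sum(contribs)
    sign = 1.0 if score >= 0 else -1.0  # class 1 iff score >= 0

    # Worst-case signed contribution of each feature when it is left free.
    worst = []
    for f, (lo, hi), c in zip(shape_fns, domains, contribs):
        vals = [sign * f(lo + (hi - lo) * j / (grid - 1)) for j in range(grid)]
        worst.append(min(min(vals), sign * c))

    # Importance: how much fixing feature i raises the worst-case margin.
    gain = [sign * c - w for c, w in zip(contribs, worst)]
    order = sorted(range(len(x)), key=lambda i: gain[i], reverse=True)
    base = sign * bias + sum(worst)  # worst-case margin with nothing fixed

    def sufficient(k: int) -> bool:
        margin = base + sum(gain[i] for i in order[:k])
        return margin >= 0 if sign > 0 else margin > 0

    # Sufficiency is monotone in k because gains are non-negative,
    # so binary search finds the smallest sufficient prefix.
    lo_k, hi_k = 0, len(x)
    while lo_k < hi_k:
        mid = (lo_k + hi_k) // 2
        if sufficient(mid):
            hi_k = mid
        else:
            lo_k = mid + 1
    return sorted(order[:lo_k])


# Hypothetical toy model with three features, each on [-1, 1].
toy_fns = [lambda v: 3 * v, lambda v: v * v - 0.5, lambda v: 0.1 * math.sin(v)]
toy_domains = [(-1.0, 1.0)] * 3

if __name__ == "__main__":
    print(explain_additive(toy_fns, (1.0, 0.5, 0.0), toy_domains))
    print(explain_additive(toy_fns, (0.1, 1.0, 0.0), toy_domains))
```

In this toy model the first input is explained by feature 0 alone, while the second needs features 0 and 1. The greedy prefix is cardinally minimal here because, in an additive model, the worst-case margin of any subset is `base` plus the sum of its gains, so the top-k gains are the best choice for any fixed size k.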
Peer Reviews
Decision: ICLR 2026 Poster
- Certifying explanations is a key open research direction: the research problem highlighted by the authors is well-defined and relevant to the ML community. To the best of my knowledge, this is the first approach specifically designed for NAMs.
- Requiring only a logarithmic number of verification queries is a step forward compared to the baselines. A formal complexity analysis is included.
- Convincing experimental campaign, aligned with best practice in the community. The performance of the approach is promising.
- Limited impact, as the proposed method is applicable to NAMs only by design.
- Although the evaluation is convincing, the adopted proxy quality metric for explanations is size, where smaller = better. A human assessment of the perceived quality of the resulting explanations would have made the work stronger, although I acknowledge this is a significant addition to the work (i.e. yes, consensus in the literature is the smaller the better, but what if the resulting explanations are *too* small, and
**S1.** The problem of computing minimum-size sufficient reasons is $\Sigma_2^p$-hard for neural networks (Barcelo et al., 2020). Therefore, it is reasonable to explore simpler model classes that can lead to efficient algorithms with a reasonable number of calls to an NP oracle. In this context, Neural Additive Models are a rational choice. **S2.** Experimental results across four datasets demonstrate that concise explanations can be generated in a reasonable amount of time.
**W1.** The computational complexity of finding minimum-size sufficient reasons for the class of NAMs has not been proven. Identifying the corresponding complexity class (P, NP, $\Sigma_2^p$) is crucial for justifying the overall interest in this approach. **W2.** Barcelo et al. (2020) have already demonstrated that the problem is solvable in polynomial time for linear threshold functions. Essentially, the theoretical results presented in this study are just an extension of their previous work,
Strengths. (1) Addresses an important formal-XAI problem (provable minimal explanations) and achieves a stronger guarantee (global cardinality minimality) than most prior work for neural networks. (2) Elegant exploitation of the additive NAM structure to reduce verification complexity. (3) Clear algorithmic presentation with theoretical propositions and proofs in the appendix. (4) Empirical results convincingly illustrate both efficiency and the need for provable guarantees.
Limitations / concerns. (1) The guarantees rely on an exact, sound verifier and on refining intervals until importance bounds separate. In practice, verifier soundness, numerical issues, or timeouts can undermine the provable guarantees; the authors acknowledge this but more discussion of practical mitigations (timeouts, numerical tolerances) would be useful. (2) The attractive complexity (logarithmic verifier calls) is obtained assuming parallel refinement across features. The number of wall-
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
