Provably Explaining Neural Additive Models

Shahaf Bassan; Yizhak Yisrael Elboher; Tobias Ladner; Volkan Şahin; Jan Kretinsky; Matthias Althoff; Guy Katz

arXiv:2602.17530·cs.LG·February 20, 2026


Open Access · 3 Reviews

TL;DR

This paper introduces an efficient, model-specific algorithm that generates provably cardinally-minimal explanations for Neural Additive Models (NAMs), replacing the worst-case exponential number of verification queries needed for standard neural networks with a logarithmic number.

Contribution

The authors develop a model-specific algorithm that produces provably cardinally-minimal explanations for NAMs using only a logarithmic number of verification queries, and report that it outperforms existing algorithms.

Findings

01. Produces smaller, provably minimal explanations
02. Reduces computation time compared to previous methods
03. Outperforms existing algorithms in explanation quality

Abstract

Despite significant progress in post-hoc explanation methods for neural networks, many remain heuristic and lack provable guarantees. A key approach for obtaining explanations with provable guarantees is by identifying a cardinally-minimal subset of input features which by itself is provably sufficient to determine the prediction. However, for standard neural networks, this task is often computationally infeasible, as it demands a worst-case exponential number of verification queries in the number of input features, each of which is NP-hard. In this work, we show that for Neural Additive Models (NAMs), a recent and more interpretable neural network family, we can efficiently generate explanations with such guarantees. We present a new model-specific algorithm for NAMs that generates provably cardinally-minimal explanations using only a logarithmic number of verification queries in…
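To make the idea of a cardinally-minimal sufficient subset concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes an additive model f(x) = b + Σ f_i(x_i) with a positive prediction at the input, and it assumes that an exact lower bound on each univariate term f_i is already known. In the paper's setting such information would have to come from verification queries on the per-feature networks. The names `is_sufficient`, `minimal_explanation`, `contrib`, `lower`, and `bias` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def is_sufficient(fixed: Sequence[int], contrib: Sequence[float],
                  lower: Sequence[float], bias: float) -> bool:
    """Worst-case check for a positive prediction.

    Features in `fixed` keep their actual contribution f_i(x_i); every other
    feature is set to its assumed minimum contribution.
    """
    fixed_set = set(fixed)
    worst = bias + sum(
        contrib[i] if i in fixed_set else lower[i] for i in range(len(contrib))
    )
    return worst > 0.0


def minimal_explanation(contrib: Sequence[float], lower: Sequence[float],
                        bias: float) -> List[int]:
    """Return a smallest feature subset that guarantees the positive prediction."""
    n = len(contrib)
    if bias + sum(contrib) <= 0.0:
        raise ValueError("this sketch assumes a positive prediction at the input")
    if any(lower[i] > contrib[i] for i in range(n)):
        raise ValueError("each lower bound must not exceed the actual contribution")

    # Fixing feature i raises the worst case by contrib[i] - lower[i] >= 0,
    # so the best subset of size k is always the k features with largest gain.
    order = sorted(range(n), key=lambda i: contrib[i] - lower[i], reverse=True)

    # Sufficiency of the top-k prefix is monotone in k, so binary search
    # needs only O(log n) sufficiency checks.
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if is_sufficient(order[:mid], contrib, lower, bias):
            hi = mid
        else:
            lo = mid + 1
    return sorted(order[:lo])


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    print(minimal_explanation([2.0, -0.5, 1.0, 0.3], [-1.0, -1.0, -0.5, 0.0], -0.5))  # [0, 2]
```

In this toy example, fixing feature 0 alone leaves a worst case of exactly 0, which does not guarantee the prediction, while fixing features 0 and 2 does. The binary search over the sorted prefix is what keeps the number of sufficiency checks logarithmic in the number of features under these assumptions.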

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 8 · Confidence 2

Strengths

- Certifying explanations is a key open research direction: the research problem highlighted by the authors is well-defined and relevant to the ML community. To the best of my knowledge, this is the first approach specifically designed for NAMs.
- The logarithmic number of verification queries required is a step forward compared to the baselines. A formal complexity analysis is included.
- Convincing experimental campaign, aligned with best practice in the community. The performance of the approach is of interest.

Weaknesses

- Limited impact, as the proposed method is applicable to NAMs only by design.
- Although the evaluation is convincing, the adopted proxy quality metric for explanations is size, where smaller = better. A human assessment of the perceived quality of the resulting explanations would have made the work stronger, although I acknowledge this is a significant addition to the work (i.e. yes, consensus in the literature is the smaller the better, but what if the resulting explanations are *too* small, and

Reviewer 02 · Rating 2 · Confidence 4

Strengths

**S1.** The problem of computing minimum-size sufficient reasons is $\Sigma_p^2$-hard for neural networks (Barcelo et al., 2020). Therefore, it is reasonable to explore simpler model classes that can lead to efficient algorithms with a reasonable number of calls to an NP oracle. In this context, Neural Additive Models are a rational choice.

**S2.** Experimental results across four datasets demonstrate that concise explanations can be generated in a reasonable amount of time.

Weaknesses

**W1.** The computational complexity of finding minimum-size sufficient reasons for the class of NAMs has not been proven. Identifying the corresponding complexity class (P, NP, $\Sigma_p^2$) is crucial for justifying the overall interest in this approach.

**W2.** Barcelo et al. (2020) have already demonstrated that the problem is solvable in polynomial time for linear threshold functions. Essentially, the theoretical results presented in this study are just an extension of their previous work,

Reviewer 03 · Rating 8 · Confidence 4

Strengths

(1) Addresses an important formal-XAI problem (provable minimal explanations) and achieves a stronger guarantee (global cardinality minimality) than most prior work for neural networks.
(2) Elegant exploitation of the additive NAM structure to reduce verification complexity.
(3) Clear algorithmic presentation with theoretical propositions and proofs in the appendix.
(4) Empirical results convincingly illustrate both efficiency and the need for provable guarantees.

Weaknesses

Limitations / concerns.
(1) The guarantees rely on an exact, sound verifier and on refining intervals until importance bounds separate. In practice, verifier soundness, numerical issues, or timeouts can undermine the provable guarantees; the authors acknowledge this, but more discussion of practical mitigations (timeouts, numerical tolerances) would be useful.
(2) The attractive complexity (logarithmic verifier calls) is obtained assuming parallel refinement across features. The number of wall-


Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications