Guaranteed Optimal Compositional Explanations for Neurons
Biagio La Rosa, Leilani H. Gilpin

TL;DR
This paper introduces a framework for computing guaranteed optimal compositional explanations for neurons, addressing the limitations of beam search and revealing that many existing explanations are suboptimal, especially with overlapping concepts.
Contribution
It presents the first method for guaranteed optimal explanations, including a decomposition, heuristic, and algorithm, improving explanation quality and computational efficiency.
Findings
10-40% of explanations are suboptimal with beam search.
The new method matches or improves runtime over prior approaches.
The framework reveals significant differences between optimal and non-optimal explanations.
Abstract
While neurons are the basic units of deep neural networks, it is still unclear what they learn and if their knowledge is aligned with that of humans. Compositional explanations aim to answer this question by describing the spatial alignment between neuron activations and concepts through logical rules. These logical descriptions are typically computed via a search over all possible concept combinations. Since computing the spatial alignment over the entire state space is computationally infeasible, the literature commonly adopts beam search to restrict the space. However, beam search cannot provide any theoretical guarantees of optimality, and it remains unclear how close current explanations are to the true optimum. In this theoretical paper, we address this gap by introducing the first framework for computing guaranteed optimal compositional explanations. Specifically, we propose: (i)…
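To make the setting in the abstract concrete, here is a minimal sketch of how a compositional explanation can be scored and searched. It is not the paper's algorithm: it assumes that alignment is measured as intersection over union (IoU) between a binarized neuron activation mask and the mask of a logical formula over concepts, and it compares a width-limited beam search with a brute-force search over the same formula space. The concept names, mask sizes, random data, and the helper names `iou`, `extend`, and `search` are all hypothetical.

```python
from itertools import product

import numpy as np


def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


# Binary operators used to extend a formula with one more concept.
OPS = {
    "AND": lambda m, c: m & c,
    "OR": lambda m, c: m | c,
    "AND NOT": lambda m, c: m & ~c,
}


def extend(candidates, concepts):
    """Extend every (label, mask) formula with one operator and one concept."""
    out = []
    for (label, mask), (op, fn), (name, cmask) in product(
        candidates, OPS.items(), concepts.items()
    ):
        out.append((f"({label} {op} {name})", fn(mask, cmask)))
    return out


def search(neuron, concepts, length, beam=None):
    """Return the best formula with at most `length` concepts and its IoU.

    With beam=None every candidate is kept at each step (brute force);
    otherwise only the `beam` highest-scoring candidates are extended.
    """

    def score(cand):
        return iou(neuron, cand[1])

    frontier = list(concepts.items())
    best = max(frontier, key=score)
    for _ in range(length - 1):
        if beam is not None:
            frontier = sorted(frontier, key=score, reverse=True)[:beam]
        frontier = extend(frontier, concepts)
        best = max([best] + frontier, key=score)
    return best[0], score(best)


# Hypothetical toy data: random, overlapping concept masks on a 16x16 grid.
rng = np.random.default_rng(0)
neuron = rng.random((16, 16)) > 0.6
concepts = {
    name: rng.random((16, 16)) > 0.5 for name in ["sky", "water", "tree", "road"]
}

beam_label, beam_iou = search(neuron, concepts, length=3, beam=1)
opt_label, opt_iou = search(neuron, concepts, length=3)
print(f"beam (width 1): {beam_label}  IoU={beam_iou:.3f}")
print(f"brute force:    {opt_label}  IoU={opt_iou:.3f}")
```

Because the brute-force run extends every candidate, its IoU can never be lower than the beam result on the same inputs; any gap between the two is the kind of suboptimality the paper sets out to measure. The number of candidates grows multiplicatively with each added concept, which is why exhaustive enumeration does not scale and the literature falls back on beam search.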
Peer Reviews
Decision: Submitted to ICLR 2026
- The proposed dIoU decomposition is both intuitive and mathematically sound, while the branch-and-bound framework offers a rigorous alternative to heuristic search by ensuring correctness and optimality.
- The paper provides clear empirical evidence of the conditions under which beam-based heuristics fail and quantifies the performance gap to the optimal solution. This helps resolve ambiguities in prior neuron interpretability research.
- Beyond theoretical contributions, the authors leverage t
- In Table 2, [Beam + Our H.] does not guarantee optimality but shows substantial runtime reduction. However, the paper does not report how far its performance (IoU) falls short of the Optimal IoU. In addition, Appendix A.6 presents qualitative comparisons between M-MESH and Optimal, but a quantitative evaluation (IoU) should also be reported. I would recommend augmenting Table 3 to include the IoU achieved by [MMESH Beam] and [Beam + Our H.] alongside Beam IoU and Optimal IoU, so readers can as
- The problem (understanding the alignment between neuron activations and human-interpretable concepts) is timely and relevant to XAI research.
- The authors attempt to formalise the notion of “optimal compositional explanation,” which, if properly developed, could be a valuable contribution.
- The experimental section seems carefully executed and uses appropriate datasets from prior work.
(1) Almost every foundational concept in the paper is left ambiguous or inconsistently defined. These omissions make the entire problem formulation very difficult to follow. A non-expert reader cannot reconstruct what the proposed algorithm is even operating on. For example, it is never stated what kind of “neurons” are being studied—feed-forward units, convolutional filters, or something else. The experimental section reveals that the models are CNNs, but this should be stated explicitly in
- The paper studies an important problem.
- The search for better evaluation metrics of compositional explanations seems well motivated.
- The efficiency results in the optimal compositional explanations are compelling.
- The authors study 3 diverse datasets.
The writing of the paper is often unclear. For example, the paper focuses on explanation of neurons in vision CNNs, although this is not clearly stated in the manuscript. In this vein, it would also be nice to see illustrative examples of compositional explanations and how they are improved by the proposed method. The introduced dIoU metric seems sensible, but it would be nice to see it better motivated and compared against existing metrics rather than simply stated as a sequence of definit
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Advanced Neural Network Applications
