Aggregate Models, Not Explanations: Improving Feature Importance Estimation
Joseph Paillard, Angel Reyero Lobo, Denis A. Engemann, Bertrand Thirion

TL;DR
This paper shows that, when ensembling is used to stabilize feature-importance estimates, explaining a single ensembled model is more accurate than aggregating the explanations of individual models, especially for complex models. The claim is supported by theoretical analysis and empirical validation.
Contribution
It provides a theoretical framework showing that, for complex models, ensembling at the model level yields more accurate feature-importance estimates than aggregating individual model explanations.
Findings
Model ensembling reduces importance estimation error.
Ensembling at the model level outperforms aggregating the explanations of individual models.
Validated on benchmarks and large-scale biomedical data.
Abstract
Feature-importance methods show promise in transforming machine learning models from predictive engines into tools for scientific discovery. However, due to data sampling and algorithmic stochasticity, expressive models can be unstable, leading to inaccurate variable importance estimates and undermining their utility in critical biomedical applications. Although ensembling offers a solution, deciding whether to explain a single ensemble model or aggregate individual model explanations is difficult due to the nonlinearity of importance measures and remains largely understudied. Our theoretical analysis, developed under assumptions accommodating complex state-of-the-art ML models, reveals that this choice is primarily driven by the model's excess risk. In contrast to prior literature, we show that ensembling at the model level provides more accurate variable-importance estimates,…
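The nonlinearity mentioned in the abstract can be seen in a small sketch. The example below is illustrative only and is not the paper's method or experimental setup: it assumes synthetic data, bootstrap-fitted least-squares models, and a sensitivity-style importance (the mean squared change in prediction when one feature is permuted). All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): y depends on features 0 and 1; feature 2 is pure noise.
n, p = 200, 3
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.5 * rng.normal(size=n)
X_test = rng.normal(size=(1000, p))


def fit_bootstrap_model(X, y, rng):
    """Fit ordinary least squares on a bootstrap resample; return a predict function."""
    idx = rng.integers(0, len(y), size=len(y))
    A = np.hstack([X[idx], np.ones((len(idx), 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ coef


def sensitivity_importance(predict, X, perms):
    """Mean squared change in prediction when each feature is permuted in turn."""
    base = predict(X)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = X[perms[j], j]
        scores[j] = np.mean((predict(X_perm) - base) ** 2)
    return scores


models = [fit_bootstrap_model(X, y, rng) for _ in range(20)]
# Share the same permutations across all explanations so the comparison is fair.
perms = [rng.permutation(len(X_test)) for _ in range(p)]

# Explanation-level aggregation: explain each model, then average the scores.
explanation_level = np.mean(
    [sensitivity_importance(m, X_test, perms) for m in models], axis=0
)


# Model-level ensembling: average the predictions, then explain the ensemble once.
def ensemble_predict(Z):
    return np.mean([m(Z) for m in models], axis=0)


model_level = sensitivity_importance(ensemble_predict, X_test, perms)

print("explanation-level:", explanation_level)
print("model-level:      ", model_level)
```

Because this particular measure is a quadratic function of the predictions, Jensen's inequality guarantees that the model-level score never exceeds the averaged per-model score. For the noise feature, whose sensitivity under the true regression function is zero, the model-level estimate is therefore the closer of the two. Other importance measures need not behave this way, which is why the choice requires the kind of analysis the abstract describes.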
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare
