Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization
Alexander Hinterleitner, Thomas Bartz-Beielstein

TL;DR
This paper introduces a multi-objective optimization framework that tunes neural networks for both predictive performance and explanation consistency, with the aim of producing more interpretable and robust models.
Contribution
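It proposes the concept of XAI consistency (the agreement among different feature attribution methods), develops metrics to quantify it, and integrates it into hyperparameter tuning as part of a multi-objective framework implemented in the Sequential Parameter Optimization Toolbox (SPOT).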
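To make "agreement among feature attribution methods" concrete, the following is a minimal sketch of one possible consistency score: the mean pairwise Spearman rank correlation between per-feature importance vectors. This is an illustrative assumption, not necessarily one of the metrics proposed in the paper; the function names and the attribution values are hypothetical, and ties are broken by position for simplicity.

```python
from itertools import combinations

import numpy as np


def rank(values):
    """Return 0-based ranks of a 1-D array (ties broken by position, an assumption)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def xai_consistency(attributions):
    """Mean pairwise Spearman correlation between attribution vectors.

    attributions: dict mapping a method name to per-feature importance scores.
    Absolute values are ranked, so only the magnitude of importance is compared.
    """
    ranked = [rank(np.abs(np.asarray(a, dtype=float))) for a in attributions.values()]
    pairs = list(combinations(ranked, 2))
    if not pairs:
        raise ValueError("need at least two attribution methods")
    # Spearman correlation is the Pearson correlation of the ranks.
    return float(np.mean([np.corrcoef(a, b)[0, 1] for a, b in pairs]))


# Hypothetical attribution scores from three methods over four features.
scores = {
    "method_a": [0.50, 0.30, 0.15, 0.05],
    "method_b": [0.45, 0.35, 0.12, 0.08],
    "method_c": [0.05, 0.15, 0.30, 0.50],
}
print(xai_consistency(scores))
```

A score near 1 means the methods rank features similarly; a score near -1 means they disagree strongly. Here two methods agree and the third reverses the ranking, so the score is negative.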
Findings
XAI consistency can be quantified and optimized during model tuning.
Trade-offs exist between predictive performance and interpretability, as measured by XAI consistency.
Balancing these objectives may lead to more robust and reliable models.
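The trade-off described in the findings above can be pictured as a Pareto front over candidate configurations. The sketch below is a generic non-dominated filter, assuming each candidate is summarized by a loss (lower is better) and a consistency score (higher is better); the configuration names and values are hypothetical and are not results from the paper.

```python
def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    Each candidate is a tuple (name, loss, consistency).
    Lower loss and higher consistency are better.
    """
    front = []
    for name, loss, cons in candidates:
        dominated = any(
            # The other candidate is at least as good on both objectives...
            (l2 <= loss and c2 >= cons)
            # ...and strictly better on at least one.
            and (l2 < loss or c2 > cons)
            for _, l2, c2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical tuning results: (name, validation loss, consistency score).
configs = [
    ("cfg_1", 0.10, 0.40),
    ("cfg_2", 0.12, 0.70),
    ("cfg_3", 0.15, 0.65),
    ("cfg_4", 0.20, 0.90),
]
print(pareto_front(configs))
```

In this toy data, cfg_3 is dropped because cfg_2 has both lower loss and higher consistency; the remaining configurations each represent a different balance between the two objectives.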
Abstract
Despite the growing interest in Explainable Artificial Intelligence (XAI), explainability is rarely considered during hyperparameter tuning or neural architecture optimization, where the focus remains primarily on minimizing predictive loss. In this work, we introduce the novel concept of XAI consistency, defined as the agreement among different feature attribution methods, and propose new metrics to quantify it. For the first time, we integrate XAI consistency directly into the hyperparameter tuning objective, creating a multi-objective optimization framework that balances predictive performance with explanation robustness. Implemented within the Sequential Parameter Optimization Toolbox (SPOT), our approach uses both weighted aggregation and desirability-based strategies to guide model selection. Through our proposed framework and supporting tools, we explore the impact of…
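The two scalarization strategies named in the abstract, weighted aggregation and desirability, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation or the SPOT API: the weights, the linear desirability ramps, the value ranges, and all function names are hypothetical.

```python
import math


def weighted_objective(loss, consistency, w_loss=0.5, w_cons=0.5):
    """Scalar value to minimize: weighted loss minus weighted consistency."""
    return w_loss * loss - w_cons * consistency


def desirability_min(y, low, high):
    """Smaller-is-better desirability: 1 at or below low, 0 at or above high."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return (high - y) / (high - low)


def desirability_max(y, low, high):
    """Larger-is-better desirability: 0 at or below low, 1 at or above high."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return (y - low) / (high - low)


def overall_desirability(loss, consistency, loss_range=(0.0, 1.0), cons_range=(0.0, 1.0)):
    """Geometric mean of the two desirabilities (to be maximized)."""
    d_loss = desirability_min(loss, *loss_range)
    d_cons = desirability_max(consistency, *cons_range)
    return math.sqrt(d_loss * d_cons)


# Hypothetical candidate: validation loss 0.2, consistency 0.8.
print(weighted_objective(0.2, 0.8))
print(overall_desirability(0.2, 0.8))
```

One design difference worth noting: with the geometric mean, a candidate that is unacceptable on either objective gets an overall desirability of 0, whereas a weighted sum lets a strong value on one objective compensate for a poor value on the other.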
