Exploring the Relationship Between Feature Attribution Methods and Model Performance
Priscylla Silva, Claudio T. Silva, Luis Gustavo Nonato

TL;DR
This study investigates how the consistency among different feature attribution methods correlates with the predictive performance of models in educational contexts, highlighting the importance of explainability for model trustworthiness.
Contribution
The paper compares nine explanation methods and, using Spearman's correlation, reports a very strong correlation between the level of agreement among those methods and the predictive model's performance.
Findings
Very strong Spearman correlation between explanation agreement and model performance
Multiple explanation methods tend to agree more on better-performing models
Enhancing explainability can improve trust in educational predictive models
Abstract
Machine learning and deep learning models are pivotal in educational contexts, particularly in predicting student success. Despite their widespread application, a significant gap persists in comprehending the factors influencing these models' predictions, especially in explainability within education. This work addresses this gap by employing nine distinct explanation methods and conducting a comprehensive analysis to explore the correlation between the agreement among these methods in generating explanations and the predictive model's performance. Applying Spearman's correlation, our findings reveal a very strong correlation between the model's performance and the agreement level observed among the explanation methods.
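The analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not say how agreement between explanation methods is measured, so the sketch assumes one plausible stand-in: the mean pairwise Spearman correlation between the absolute feature attributions of each pair of methods. The function names, the toy attribution values, and the performance scores are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def explanation_agreement(attributions):
    """Assumed agreement score: mean pairwise Spearman between methods' |attributions|."""
    magnitudes = [[abs(v) for v in method] for method in attributions]
    return mean(spearman(a, b) for a, b in combinations(magnitudes, 2))


# Hypothetical toy data: for each model, attributions from three methods over four features.
toy_attributions = {
    "model_a": [[0.4, 0.3, 0.2, 0.1], [0.5, 0.3, 0.15, 0.05], [0.7, 0.2, 0.08, 0.02]],
    "model_b": [[0.4, 0.3, 0.2, 0.1], [0.3, 0.4, 0.2, 0.1], [0.4, 0.3, 0.1, 0.2]],
    "model_c": [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4], [0.3, 0.1, 0.4, 0.2]],
}
# Hypothetical performance scores for the same models.
toy_performance = {"model_a": 0.90, "model_b": 0.75, "model_c": 0.60}

models = sorted(toy_attributions)
agreement = [explanation_agreement(toy_attributions[m]) for m in models]
performance = [toy_performance[m] for m in models]

# Correlate per-model agreement with per-model performance.
rho = spearman(agreement, performance)
print(f"Spearman correlation between agreement and performance: {rho:.3f}")
```

With real data, each inner list would hold one method's attribution scores for a trained model, and the final correlation would be computed across all evaluated models. Other agreement measures, such as top-k feature overlap, could be substituted in `explanation_agreement` without changing the rest of the sketch.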
Taxonomy
Topics: Machine Learning and Data Classification
