TL;DR
This paper introduces gradient rollback, a scalable and robust influence estimation method for neural models whose gradient updates are sparse, such as neural matrix factorization. It is aimed at faithfully explaining model predictions, especially in knowledge graph and recommender systems.
Contribution
Gradient rollback is an efficient influence estimation approach for neural models in which each gradient step updates only a small subset of the parameters. This makes explanations practical for large-scale neural matrix factorization models.
Findings
Gradient rollback is highly efficient at both training and test time.
It provides more faithful explanations than influence functions.
A theoretical bound shows that its estimate stays close to the true influence.
Abstract
Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model's behavior allows us to identify training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The most generally applicable existing method is based on influence functions, which scale poorly for larger sample sizes and models. We propose gradient rollback, a general approach for influence estimation, applicable to neural models where each parameter update step during gradient descent touches a smaller number of parameters, even if the overall number of parameters is large. Neural matrix factorization models trained with gradient descent are part of this model class. These models are…
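The idea in the abstract can be illustrated with a minimal sketch. It assumes a plain matrix factorization model trained with SGD on squared error, which is a simplification chosen for illustration and not the authors' implementation. The function names (train_mf_with_rollback, rollback_influence), the toy data, and all hyperparameters are hypothetical. Each SGD step touches only one user row and one item row, so the summed update caused by each training example can be stored cheaply. At test time, those updates are subtracted ("rolled back") and the change in the predicted score is taken as the estimated influence.

```python
import numpy as np
from collections import defaultdict


def train_mf_with_rollback(triples, n_users, n_items, dim=4, lr=0.1, epochs=50, seed=0):
    """SGD on squared error for r ~ U[u] . V[i], logging per-example updates.

    Hypothetical helper for illustration; not the paper's code.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, dim))
    V = 0.1 * rng.standard_normal((n_items, dim))
    # updates[k] accumulates the parameter change caused by training example k.
    # Only one row of U and one row of V change per step, so this stays small.
    updates = defaultdict(lambda: {"U": np.zeros(dim), "V": np.zeros(dim)})
    for _ in range(epochs):
        for k in rng.permutation(len(triples)):
            k = int(k)
            u, i, r = triples[k]
            err = U[u] @ V[i] - r
            # Compute both steps from the current parameters before applying them.
            dU = -lr * err * V[i]
            dV = -lr * err * U[u]
            U[u] += dU
            V[i] += dV
            updates[k]["U"] += dU
            updates[k]["V"] += dV
    return U, V, updates


def rollback_influence(U, V, updates, triples, k, u_test, i_test):
    """Estimated change in the test score attributable to training example k."""
    u, i, _ = triples[k]
    U_rb, V_rb = U.copy(), V.copy()
    # Roll back everything example k contributed during training.
    U_rb[u] -= updates[k]["U"]
    V_rb[i] -= updates[k]["V"]
    return float(U[u_test] @ V[i_test] - U_rb[u_test] @ V_rb[i_test])


# Toy data: (user, item, rating) triples, made up for this sketch.
triples = [(0, 0, 1.0), (0, 1, 1.0), (1, 1, 0.0), (2, 2, 1.0)]
U, V, updates = train_mf_with_rollback(triples, n_users=3, n_items=3)

# Rank training examples by estimated influence on the prediction for (user 0, item 1).
scores = [rollback_influence(U, V, updates, triples, k, 0, 1) for k in range(len(triples))]
ranking = sorted(range(len(triples)), key=lambda k: abs(scores[k]), reverse=True)
```

In this sketch, a training example that shares neither the user nor the item with the test pair gets an estimated influence of exactly zero, because rolling it back leaves the two relevant embedding rows unchanged. The cost of one influence query is a couple of row subtractions and two dot products, which is the kind of test-time efficiency the summary refers to.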
