Bandits for Learning to Explain from Explanations
Freya Behrens, Stefano Teso, Davide Mottin

TL;DR
This paper presents Explearn, an online algorithm using Gaussian Process-based contextual bandits to jointly learn predictions and explanations, offering controlled generalization and theoretical convergence guarantees.
Contribution
Introduction of Explearn, a novel GP-based contextual bandit algorithm that learns to generate explanations alongside predictions with provable convergence.
Findings
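Initial experiments hint at the promise of the approach; the results are preliminary.
GPs capture different kinds of explanations, and the choice of kernel controls how explanations generalize across the space.
Explearn builds on contextual bandit results that guarantee convergence with high probability.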
Abstract
We introduce Explearn, an online algorithm that learns to jointly output predictions and explanations for those predictions. Explearn leverages Gaussian Processes (GP)-based contextual bandits. This brings two key benefits. First, GPs naturally capture different kinds of explanations and enable the system designer to control how explanations generalize across the space by virtue of choosing a suitable kernel. Second, Explearn builds on recent results in contextual bandits which guarantee convergence with high probability. Our initial experiments hint at the promise of the approach.
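To make the "GP-based contextual bandit" idea concrete, here is a minimal sketch of a generic GP-UCB loop. It is not the Explearn algorithm itself, whose details the abstract does not give. Everything named here is an assumption made for illustration: the class `GPUCB`, the RBF kernel, the parameters `length_scale`, `noise`, and `beta`, and the encoding of each arm as a feature vector (which could, hypothetically, stand for a candidate prediction-explanation pair). The kernel's `length_scale` plays the role the abstract assigns to the kernel: it determines how far feedback on one input generalizes to nearby inputs.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


class GPUCB:
    """Illustrative GP-UCB contextual bandit (hypothetical, not Explearn)."""

    def __init__(self, length_scale=1.0, noise=0.1, beta=2.0):
        self.length_scale = length_scale  # how widely feedback generalizes
        self.noise = noise                # assumed observation noise (std)
        self.beta = beta                  # exploration weight in the UCB score
        self.Z = []                       # observed (context, arm) inputs
        self.r = []                       # observed rewards

    def posterior(self, Zq):
        """Posterior mean and std of the reward at the query inputs Zq."""
        Zq = np.atleast_2d(Zq)
        if not self.Z:
            # Zero-mean prior with unit variance, since k(z, z) = 1.
            return np.zeros(len(Zq)), np.ones(len(Zq))
        Z, r = np.array(self.Z), np.array(self.r)
        K = rbf_kernel(Z, Z, self.length_scale) + self.noise**2 * np.eye(len(Z))
        Ks = rbf_kernel(Z, Zq, self.length_scale)
        mean = Ks.T @ np.linalg.solve(K, r)
        var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
        return mean, np.sqrt(np.maximum(var, 0.0))

    def select(self, context, arms):
        """Return the index of the arm with the highest upper confidence bound."""
        Zq = np.hstack([np.tile(context, (len(arms), 1)), arms])
        mean, std = self.posterior(Zq)
        return int(np.argmax(mean + self.beta * std))

    def update(self, context, arm, reward):
        """Record the reward observed for the chosen arm in this context."""
        self.Z.append(np.concatenate([context, arm]))
        self.r.append(float(reward))


# Usage: two arms in one context; feedback favors the second arm.
bandit = GPUCB(length_scale=1.0)
context = np.array([0.5])
arms = np.array([[0.0], [1.0]])
bandit.update(context, arms[0], -1.0)
bandit.update(context, arms[1], 1.0)
chosen = bandit.select(context, arms)
```

A shorter `length_scale` makes each observation inform only very similar (context, arm) inputs, while a longer one spreads it further; this is the design lever the abstract attributes to kernel choice.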
Taxonomy
Topics: Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI) · Machine Learning and Algorithms
Methods: Greedy Policy Search
