Game-Theoretic Interpretability for Temporal Modeling
Guang-He Lee, David Alvarez-Melis, Tommi S. Jaakkola

TL;DR
This paper introduces a game-theoretic framework for interpreting temporal sequence models. A predictor and an explainer play a co-operative game in which the explainer shows, locally, how well the predictor conforms to an interpretable family of temporal models, without restricting the predictor's model class.
Contribution
It proposes a co-operative game between a predictor and an explainer for interpretability in temporal modeling, with no a priori restrictions on the predictor's functional class.
Findings
Framework effectively highlights local conformity to interpretable models
Demonstrations on temporal sequence models validate the approach
Enhances interpretability without sacrificing model flexibility
Abstract
Interpretability has arisen as a key desideratum of machine learning models alongside performance. Approaches so far have been primarily concerned with fixed-dimensional inputs, emphasizing feature relevance or selection. In contrast, we focus on temporal modeling and the problem of tailoring the predictor, functionally, towards an interpretable family. To this end, we propose a co-operative game between the predictor and an explainer without any a priori restrictions on the functional class of the predictor. The goal of the explainer is to highlight, locally, how well the predictor conforms to the chosen interpretable family of temporal models. Our co-operative game is set up asymmetrically in terms of information sets for efficiency reasons. We develop and illustrate the framework in the context of temporal sequence models with examples.
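The predictor/explainer interaction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that the interpretable family is linear autoregressive (AR) models, that a "local" neighborhood is a run of consecutive time windows, and that deviation is measured by mean squared error. All function names (lagged_windows, explainer_fit, local_deviation, predictor_objective) and the weight parameter are hypothetical.

```python
import numpy as np

def lagged_windows(series, order):
    """Stack every length-`order` window of the series as a row."""
    return np.array([series[i:i + order] for i in range(len(series) - order)])

def explainer_fit(predictor, windows):
    """Explainer's move (assumed form): least-squares fit of a linear AR
    model with bias to the predictor's outputs on one local neighborhood.
    The explainer only sees predictor outputs, not its internals."""
    design = np.hstack([windows, np.ones((len(windows), 1))])
    outputs = np.array([predictor(w) for w in windows])
    coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
    return coef

def local_deviation(predictor, windows, coef):
    """Mean squared gap between the predictor and its local explanation."""
    design = np.hstack([windows, np.ones((len(windows), 1))])
    outputs = np.array([predictor(w) for w in windows])
    return float(np.mean((design @ coef - outputs) ** 2))

def predictor_objective(predictor, windows, targets, coef, weight=1.0):
    """Predictor's side (assumed form): prediction error plus a weighted
    penalty for deviating from the explainer's local model."""
    outputs = np.array([predictor(w) for w in windows])
    error = float(np.mean((outputs - targets) ** 2))
    return error + weight * local_deviation(predictor, windows, coef)

# Toy data: a random walk, windows of three lags, one local neighborhood.
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=60))
order = 3
windows = lagged_windows(series, order)
neighborhood = windows[20:35]
neighborhood_targets = series[order:][20:35]

# Two hypothetical predictors: one inside the AR family, one outside it.
linear_predictor = lambda w: 0.6 * w[-1] + 0.3 * w[-2] + 0.1
nonlinear_predictor = lambda w: w[-1] ** 2

coef_linear = explainer_fit(linear_predictor, neighborhood)
coef_nonlinear = explainer_fit(nonlinear_predictor, neighborhood)
dev_linear = local_deviation(linear_predictor, neighborhood, coef_linear)
dev_nonlinear = local_deviation(nonlinear_predictor, neighborhood, coef_nonlinear)
print(f"linear deviation: {dev_linear:.3g}, nonlinear deviation: {dev_nonlinear:.3g}")
```

In this sketch a predictor that already lies in the interpretable family has essentially zero local deviation, while a nonlinear predictor does not. Increasing weight in predictor_objective would push a trainable predictor towards the family without ever constraining its functional class outright.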
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Topic Modeling
