Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network
Min Hun Lee

TL;DR
This paper investigates the use of a transformer-based spatiotemporal attention network (STAN) for generating gradient-based explanations in time-series data, particularly for video classification of medically relevant activities.
Contribution
It introduces a novel application of STAN combined with gradient-based XAI techniques for identifying salient frames in time-series videos.
Findings
STAN shows potential to identify important frames in videos of medically relevant activities.
The approach demonstrates potential for explainability in time-series video analysis.
In experiments on datasets of four medically relevant activities, the gradient-based explanations show promise for highlighting relevant segments.
Abstract
In this paper, we explore the feasibility of using a transformer-based spatiotemporal attention network (STAN) for gradient-based time-series explanations. First, we trained the STAN model for video classification using the global and local views of the data and weakly supervised labels on the time-series data (i.e., the type of activity). We then leveraged a gradient-based XAI technique (e.g., a saliency map) to identify salient frames of the time-series data. In experiments on datasets of four medically relevant activities, the STAN model demonstrated its potential to identify important frames of videos.
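To make the saliency-map step concrete, here is a minimal sketch of vanilla gradient saliency aggregated per frame. It is not the paper's implementation: the toy scorer `class_score` (a sum of `tanh` responses) stands in for the trained STAN classifier, and the names `input_gradient`, `frame_saliency`, `top_k_frames`, and the example clip and weights are all hypothetical. In practice the gradient would come from an automatic-differentiation framework rather than a hand-derived formula.

```python
import math

def class_score(frames, weights):
    """Toy stand-in for a class logit: sum over frames of tanh(w . x_t)."""
    return sum(math.tanh(sum(w * x for w, x in zip(weights, frame)))
               for frame in frames)

def input_gradient(frames, weights):
    """Analytic d(score)/d(input): (1 - tanh(w . x_t)^2) * w_d per frame t, feature d."""
    grads = []
    for frame in frames:
        act = math.tanh(sum(w * x for w, x in zip(weights, frame)))
        grads.append([(1.0 - act * act) * w for w in weights])
    return grads

def frame_saliency(frames, weights):
    """Sum absolute gradients within each frame, then scale so the top frame is 1."""
    raw = [sum(abs(g) for g in row) for row in input_gradient(frames, weights)]
    peak = max(raw)
    return [r / peak for r in raw] if peak > 0 else raw

def top_k_frames(saliency, k):
    """Indices of the k most salient frames, most salient first."""
    return sorted(range(len(saliency)), key=lambda t: saliency[t], reverse=True)[:k]

# Hypothetical 4-frame clip with 2 features per frame (assumed values).
frames = [[0.0, 0.0], [2.0, 1.0], [0.1, -0.1], [3.0, 3.0]]
weights = [1.0, -0.5]

saliency = frame_saliency(frames, weights)
print(top_k_frames(saliency, 2))  # prints [0, 2]
```

In this toy setting, frames 1 and 3 drive `tanh` toward saturation, so their gradients are small and they rank below frames 0 and 2. This illustrates a known property of plain gradients: the score assigned to a frame reflects local sensitivity of the output, not the magnitude of the frame's contribution.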
Taxonomy
Topics: Time Series Analysis and Forecasting · Explainable Artificial Intelligence (XAI) · Data Visualization and Analytics
