# Saliency Tubes: Visual Explanations for Spatio-Temporal Convolutions

**Authors:** Alexandros Stergiou, Georgios Kapidis, Grigorios Kalliatakis, Christos Chrysoulas, Remco Veltkamp, Ronald Poppe

arXiv: 1902.01078 · 2020-06-24

## TL;DR

Saliency Tubes provide visual explanations for 3D CNNs in video classification by highlighting the spatio-temporal regions, within each frame and over time, that the network treats as most informative.

## Contribution

The paper introduces Saliency Tubes, a novel visualization method for understanding 3D CNNs' focus areas in videos, enhancing model interpretability.

## Key findings

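- Visualizes the points and regions a 3D CNN focuses on, both within individual frames and over time.
- Makes 3D CNN decisions easier to interpret by isolating the spatio-temporal regions the network finds most informative.
- Demonstrated on widely used third-person and egocentric action classification datasets.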

## Abstract

Deep learning approaches have been established as the main methodology for video classification and recognition. Recently, 3-dimensional convolutions have been used to achieve state-of-the-art performance on many challenging video datasets. Because these methods are highly complex, with the convolution operations extended to an additional dimension in order to extract features from it as well, visualizing the signals that the network interprets as informative is a challenging task. An effective way to understand the network's inner workings is to isolate the spatio-temporal regions of the video that the network finds most informative. We propose a method called Saliency Tubes, which shows the points and regions, both at frame level and over time, that are the main focus of the network. We demonstrate our findings on widely used datasets for third-person and egocentric action classification and extend the set of methods and visualizations that improve the intelligibility of 3D Convolutional Neural Networks (CNNs).
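
To make the idea of isolating informative spatio-temporal regions more concrete, here is a minimal sketch of a class-activation-style saliency volume for a 3D CNN. It is an illustration under assumptions, not necessarily the authors' exact procedure: the function name `saliency_tube`, the use of last-layer feature maps weighted by per-class weights, the min-max normalisation, and the nearest-neighbour upsampling are all choices made for this example, and the inputs are random placeholders rather than real network outputs.

```python
import numpy as np


def saliency_tube(activations, class_weights, out_shape):
    """CAM-style sketch (assumed formulation, hypothetical helper).

    activations:   (K, t, h, w) feature maps from a late 3D conv layer.
    class_weights: (K,) weights tying each channel to the predicted class.
    out_shape:     (T, H, W) size of the input clip to upsample to.
    """
    # Weighted sum over the channel axis gives one (t, h, w) volume.
    tube = np.tensordot(class_weights, activations, axes=([0], [0]))

    # Min-max normalise to [0, 1] so the volume can be overlaid as a heatmap.
    tube = tube - tube.min()
    peak = tube.max()
    if peak > 0:
        tube = tube / peak

    # Nearest-neighbour upsampling along time, height and width.
    for axis, (src, dst) in enumerate(zip(tube.shape, out_shape)):
        idx = np.arange(dst) * src // dst
        tube = np.take(tube, idx, axis=axis)
    return tube


# Placeholder data: 8 channels of 2x4x4 activations, upsampled to a 16-frame clip.
rng = np.random.default_rng(0)
acts = rng.random((8, 2, 4, 4))
weights = rng.random(8)
tube = saliency_tube(acts, weights, (16, 112, 112))
print(tube.shape)  # (16, 112, 112)
```

High values in the resulting volume mark the frames and pixel regions that contribute most to the class score under this assumed weighting; in practice the volume would be blended over the original frames to produce the visualization.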

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.01078/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1902.01078/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1902.01078/full.md

---
Source: https://tomesphere.com/paper/1902.01078