Gradient Frequency Modulation for Visually Explaining Video Understanding Models
Xinmiao Lin, Wentao Bao, Matthew Wright, Yu Kong

TL;DR
This paper introduces Frequency-based Extremal Perturbation (F-EP), a method that modulates the frequencies of gradient maps to produce more consistent and faithful visual explanations for video understanding models. It targets the challenge of keeping explanations coherent across both space and time.
Contribution
The paper proposes a new explanation technique for video models that leverages frequency modulation of gradients to improve spatiotemporal consistency and fidelity of explanations.
Findings
F-EP is reported to produce more coherent explanations than existing methods.
F-EP is reported to represent model decisions more faithfully.
Experiments are reported to show improved spatiotemporal consistency.
Abstract
In many applications, it is essential to understand why a machine learning model makes the decisions it does, but this is inhibited by the black-box nature of state-of-the-art neural networks. Because of this, increasing attention has been paid to explainability in deep learning, including in the area of video understanding. Due to the temporal dimension of video data, the main challenge of explaining a video action recognition model is to produce spatiotemporally consistent visual explanations, which has been ignored in the existing literature. In this paper, we propose Frequency-based Extremal Perturbation (F-EP) to explain a video understanding model's decisions. Because the explanations given by perturbation methods are noisy and non-smooth both spatially and temporally, we propose to modulate the frequencies of gradient maps from the neural network model with a Discrete Cosine…
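The frequency-modulation idea in the abstract can be illustrated with a minimal sketch. The excerpt does not give the paper's exact modulation scheme, so the code below assumes the simplest form: take a 3D DCT of a gradient map of shape (T, H, W), keep only the lowest-frequency coefficients along each axis, and invert. The function names `dct_matrix` and `lowpass_dct` and the `keep` fractions are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return m


def lowpass_dct(grad, keep=(0.5, 0.25, 0.25)):
    """Zero the high-frequency DCT coefficients of a (T, H, W) gradient map.

    `keep` is the fraction of lowest frequencies retained along each axis.
    Assumption: hard low-pass masking stands in for the paper's modulation.
    """
    coeffs = np.asarray(grad, dtype=np.float64)
    shape = coeffs.shape
    mats = [dct_matrix(n) for n in shape]

    # Forward transform: apply the 1D DCT along each axis in turn.
    for axis, m in enumerate(mats):
        coeffs = np.moveaxis(np.tensordot(m, coeffs, axes=(1, axis)), 0, axis)

    # Keep only the low-frequency corner of the coefficient volume.
    mask = np.zeros(shape, dtype=bool)
    corner = tuple(
        slice(0, max(1, int(round(f * n)))) for f, n in zip(keep, shape)
    )
    mask[corner] = True
    coeffs = np.where(mask, coeffs, 0.0)

    # Inverse transform: the matrix is orthonormal, so its transpose inverts it.
    for axis, m in enumerate(mats):
        coeffs = np.moveaxis(np.tensordot(m.T, coeffs, axes=(1, axis)), 0, axis)
    return coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal((8, 16, 16))  # stand-in for a noisy gradient map
    smooth = lowpass_dct(noisy)
    print("mean frame-to-frame change before:", np.abs(np.diff(noisy, axis=0)).mean())
    print("mean frame-to-frame change after: ", np.abs(np.diff(smooth, axis=0)).mean())
```

In this sketch a smaller `keep` fraction on the first axis suppresses frame-to-frame flicker, while the other two control spatial smoothness. A hard cutoff is only one possible modulation; a soft weighting of the coefficients would be an equally plausible reading of the abstract.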
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis
Methods: Discrete Cosine Transform
