Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks
Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis

TL;DR
This paper introduces a method to quantify static versus dynamic information biases in deep spatiotemporal models, revealing prevalent static bias and proposing techniques to mitigate it for improved video analysis.
Contribution
It presents a novel approach for measuring static and dynamic biases in models and applies it across multiple video tasks, offering insights and debiasing strategies.
Findings
Most of the examined models are biased toward static information.
Some datasets assumed to be biased toward dynamics are actually biased toward static information.
Individual channels can be selectively biased toward static or dynamic information.
Abstract
There is limited understanding of the information captured by deep spatiotemporal models in their intermediate representations. For example, while evidence suggests that action recognition algorithms are heavily influenced by visual appearance in single frames, no quantitative methodology exists for evaluating such static bias in the latent representation compared to bias toward dynamics. We tackle this challenge by proposing an approach for quantifying the static and dynamic biases of any spatiotemporal model, and apply our approach to three tasks: action recognition, automatic video object segmentation (AVOS), and video instance segmentation (VIS). Our key findings are: (i) Most examined models are biased toward static information. (ii) Some datasets that are assumed to be biased toward dynamics are actually biased toward static information. (iii) Individual channels in an architecture…
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
Methods: Dropout
