Video BagNet: short temporal receptive fields increase robustness in long-term action recognition
Ombretta Strafforello, Xin Liu, Klamer Schutte, Jan van Gemert

TL;DR
Video BagNet demonstrates that limiting the temporal receptive field in 3D convolutional models enhances robustness to sub-action order variations in long-term video action recognition.
Contribution
This work introduces Video BagNet, a 3D ResNet-50 variant whose temporal receptive field is limited to 1, 9, 17 or 33 frames, and shows that it is more robust to changes in sub-action order than models with a large temporal receptive field.
Findings
Short temporal receptive fields increase robustness to sub-action order changes.
Models with larger temporal receptive fields are sensitive to sub-action order variations.
Experimental analysis on synthetic and real-world datasets supports these conclusions.
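The intuition behind these findings can be illustrated with a toy sketch. It is not the Video BagNet architecture: the `score` function, the scalar "frames" and the two kernels are hypothetical, chosen only to contrast a temporal receptive field of 1 frame with one of 2 frames, each followed by a ReLU and average pooling over time.

```python
def score(frames, kernel):
    """Toy video score: 1D temporal convolution, ReLU, then mean over time.

    The temporal receptive field equals len(kernel). Everything here is a
    hypothetical stand-in for a real 3D-convolutional model.
    """
    k = len(kernel)
    responses = []
    for t in range(len(frames) - k + 1):
        window = frames[t:t + k]
        activation = sum(w * x for w, x in zip(kernel, window))
        responses.append(max(0.0, activation))  # ReLU
    return sum(responses) / len(responses)


# Two toy videos with the same two sub-actions in opposite order.
video_ab = [1.0] * 4 + [5.0] * 4
video_ba = [5.0] * 4 + [1.0] * 4

rf1_kernel = [1.0]        # looks at one frame at a time
rf2_kernel = [-1.0, 1.0]  # compares consecutive frames, so order matters

# Receptive field of 1 frame: the score ignores sub-action order.
print(score(video_ab, rf1_kernel) == score(video_ba, rf1_kernel))

# Receptive field of 2 frames: swapping the sub-actions changes the score.
print(score(video_ab, rf2_kernel) == score(video_ba, rf2_kernel))
```

With a 1-frame receptive field the pooled score depends only on which frames occur, not on when they occur, which is the extreme case of the robustness described above. The 2-frame kernel responds to the transition between sub-actions, so reversing their order changes its output.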
Abstract
Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF). We argue that these models are not always the best choice for temporal modeling in videos. A large temporal receptive field allows the model to encode the exact sub-action order of a video, which causes a performance decrease when testing videos have a different sub-action order. In this work, we investigate whether we can improve the model robustness to the sub-action order by shrinking the temporal receptive field of action recognition models. For this, we design Video BagNet, a variant of the 3D ResNet-50 model with the temporal receptive field size limited to 1, 9, 17 or 33 frames. We analyze Video BagNet on synthetic and real-world video datasets and experimentally compare models with varying temporal receptive fields. We find that short…
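As a rough aid for reasoning about these sizes, the temporal receptive field of a stack of convolutions can be computed from each layer's temporal kernel size and temporal stride using the standard receptive-field recurrence. The function name and the layer lists below are illustrative assumptions, not the layer configuration used in the paper.

```python
def temporal_receptive_field(layers):
    """Return the temporal receptive field, in frames, of stacked convolutions.

    layers: iterable of (temporal_kernel_size, temporal_stride) pairs,
    ordered from input to output. Hypothetical helper for illustration.
    """
    rf = 1    # a single output unit initially sees one input frame
    jump = 1  # distance in input frames between adjacent units at this depth
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks, NOT the paper's architecture:
only_spatial = [(1, 1)] * 16      # temporal kernel 1 everywhere
four_temporal = [(3, 1)] * 4      # four layers with temporal kernel 3
sixteen_temporal = [(3, 1)] * 16  # sixteen layers with temporal kernel 3
strided = [(3, 2), (3, 1)]        # a temporal stride widens later layers

for name, stack in [("only_spatial", only_spatial),
                    ("four_temporal", four_temporal),
                    ("sixteen_temporal", sixteen_temporal),
                    ("strided", strided)]:
    print(name, temporal_receptive_field(stack))
```

Under these assumptions, replacing temporal kernels of size 3 with size 1 in some layers is one way to shrink the temporal receptive field while leaving the spatial layout of the network unchanged.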
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
