Action recognition with spatial-temporal discriminative filter banks
Brais Martinez, Davide Modolo, Yuanjun Xiong, Joseph Tighe

TL;DR
This paper introduces spatial-temporal discriminative filter banks to enhance the last layers of CNNs for action recognition, achieving state-of-the-art results without altering the backbone network.
Contribution
It improves the network's representation by redesigning only the final layers, borrowing techniques from fine-grained recognition to make the model more sensitive to finer details while leaving the backbone unchanged.
Findings
Achieves state-of-the-art performance on Kinetics-400
Achieves state-of-the-art performance on Something-Something-V1
Improves sensitivity to finer details in action recognition
Abstract
Action recognition has seen a dramatic performance improvement in the last few years. Most of the current state-of-the-art literature either aims at improving performance through changes to the backbone CNN or explores different trade-offs between computational efficiency and performance, again by altering the backbone network. However, almost all of these works keep the same last layers of the network, which simply consist of global average pooling followed by a fully connected layer. In this work we focus on how to improve the representation capacity of the network, but rather than altering the backbone, we focus on improving the last layers of the network, where changes have low impact in terms of computational cost. In particular, we show that current architectures have poor sensitivity to finer details and we exploit recent advances in the fine-grained…
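For reference, the standard last layers described in the abstract (global average pooling followed by a fully connected layer) can be sketched as below. This is a minimal NumPy illustration of that baseline head only, not the paper's proposed filter banks; the tensor layout `(C, T, H, W)`, the sizes, and the name `baseline_head` are assumptions made for the example. The final lines show one way to see that average pooling ignores where in space and time an activation occurs.

```python
import numpy as np


def baseline_head(features, weights, bias):
    """Global average pooling followed by a fully connected layer.

    features: backbone output, assumed shape (C, T, H, W)
    weights:  classifier weights, shape (num_classes, C)
    bias:     classifier bias, shape (num_classes,)
    """
    # Average every channel over time and space, giving one value per channel.
    pooled = features.mean(axis=(1, 2, 3))
    # Linear classifier on the pooled vector.
    return weights @ pooled + bias


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
C, T, H, W, num_classes = 8, 4, 7, 7, 5
features = rng.standard_normal((C, T, H, W))
weights = rng.standard_normal((num_classes, C))
bias = np.zeros(num_classes)

logits = baseline_head(features, weights, bias)

# Shuffle all spatio-temporal positions: the pooled vector, and therefore
# the logits, stay the same up to floating-point error.
order = rng.permutation(T * H * W)
shuffled = features.reshape(C, -1)[:, order].reshape(C, T, H, W)
shuffled_logits = baseline_head(shuffled, weights, bias)
same_logits = np.allclose(logits, shuffled_logits)
```

Because the pooled vector depends only on per-channel means, a small localized response is averaged together with every other position, which is one plausible reading of why such a head would be weak on fine details.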
Taxonomy
Methods: Residual Connection · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Residual Block · Kaiming Initialization · Max Pooling · Convolution
