TL;DR
This paper introduces dissected 3D-CNNs with temporal skip connections for efficient online video processing. The design reduces computation and preserves temporal resolution, with a reported 77-90% fewer computations during online operation and approximately 5% higher classification accuracy than traditional 3D-CNNs.
Contribution
The paper proposes dissected 3D-CNN architectures with temporal skip connections, enhancing online video recognition by reducing computation and maintaining temporal resolution.
Findings
77-90% fewer computations during online operation
Approximately 5% improvement in classification accuracy
Consistent performance gains across multiple vision tasks
Abstract
Convolutional Neural Networks with 3D kernels (3D-CNNs) currently achieve state-of-the-art results in video recognition tasks due to their supremacy in extracting spatiotemporal features within video frames. Many successful 3D-CNN architectures have successively surpassed the state-of-the-art results. However, nearly all of them are designed to operate offline, creating several serious handicaps during online operation. Firstly, conventional 3D-CNNs are not dynamic, since their output features represent the complete input clip instead of the most recent frame in the clip. Secondly, they are not temporal resolution-preserving due to their inherent temporal downsampling. Lastly, 3D-CNNs are constrained to be used with a fixed temporal input size, limiting their flexibility. In order to address these drawbacks, we propose dissected 3D-CNNs, where the intermediate volumes of the…
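The temporal downsampling drawback named in the abstract can be illustrated with the standard convolution output-length formula applied to the time axis. The sketch below is only an illustration: the layer settings in `HYPOTHETICAL_LAYERS` and both function names are assumptions made for this example and are not taken from the paper.

```python
def temporal_output_length(frames: int, kernel: int, stride: int, padding: int) -> int:
    """Standard convolution output-length formula, applied to the time axis only."""
    return (frames + 2 * padding - kernel) // stride + 1


# Hypothetical temporal (kernel, stride, padding) settings per layer.
# These values are assumed for illustration and do not come from the paper.
HYPOTHETICAL_LAYERS = [(3, 1, 1), (3, 2, 1), (3, 2, 1), (3, 2, 1)]


def trace_temporal_lengths(frames: int, layers=HYPOTHETICAL_LAYERS) -> list:
    """Return the temporal length of the feature volume after each layer."""
    lengths = [frames]
    for kernel, stride, padding in layers:
        frames = temporal_output_length(frames, kernel, stride, padding)
        lengths.append(frames)
    return lengths


# A 16-frame clip shrinks along time at every strided layer.
print(trace_temporal_lengths(16))  # [16, 16, 8, 4, 2]
```

In this assumed stack, a 16-frame clip ends with only 2 temporal positions, so the output features no longer correspond one-to-one with the input frames. This is the kind of resolution loss the abstract describes for conventional 3D-CNNs.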
Taxonomy
Methods: 1x1 Convolution · Batch Normalization · Residual Connection · Residual Block · Convolution · Bottleneck Residual Block · Kaiming Initialization · Average Pooling · Global Average Pooling
