Exploring Temporal Differences in 3D Convolutional Neural Networks

Gagan Kanojia; Sudhakar Kumawat; Shanmuganathan Raman

arXiv:1909.03309·cs.CV·September 10, 2019

TL;DR

This paper introduces a novel convolutional block that captures spatio-temporal information efficiently by combining 2D convolutions with simple temporal difference operations, reducing parameters compared to traditional 3D CNNs.

Contribution

It proposes a parameter-efficient convolutional block that leverages temporal differences, outperforming standard 3D CNNs on benchmark datasets.

Findings

01

The proposed block has n times fewer parameters than an n×n×n 3D convolution.

02

Replacing 3D convolutions with the proposed blocks improves performance.

03

The method is effective on UCF101 and ModelNet datasets.
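The parameter saving in the first finding follows from simple counting. As a rough sketch (bias terms are ignored, and the channel counts and helper names below are illustrative assumptions, not values from the paper), an n×n 2D kernel holds n² weights per input-output channel pair, while an n×n×n 3D kernel holds n³:

```python
def conv2d_params(c_in: int, c_out: int, n: int) -> int:
    """Weights in an n x n 2D convolution (bias ignored)."""
    return c_in * c_out * n * n


def conv3d_params(c_in: int, c_out: int, n: int) -> int:
    """Weights in an n x n x n 3D convolution (bias ignored)."""
    return c_in * c_out * n ** 3


# Hypothetical layer sizes, chosen only to make the ratio concrete.
c_in, c_out, n = 64, 128, 3
p2d = conv2d_params(c_in, c_out, n)
p3d = conv3d_params(c_in, c_out, n)
ratio = p3d // p2d  # equals n, independent of the channel counts
print(p2d, p3d, ratio)
```

Because the temporal operations add no trainable parameters, a block built around a single n×n 2D convolution keeps the 2D count, which is where the factor of n comes from.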

Abstract

Traditional 3D convolutions are computationally expensive and memory intensive, and because of their large number of parameters they often tend to overfit. On the other hand, 2D CNNs are less computationally expensive and less memory intensive than 3D CNNs and have shown remarkable results in applications like image classification and object recognition. However, previous works have observed that they are inferior to 3D CNNs when applied to spatio-temporal input. In this work, we propose a convolutional block that extracts spatial information by performing a 2D convolution and extracts temporal information by exploiting temporal differences, i.e., the change in the spatial information at different time instances, using simple shift, subtract, and add operations without any trainable parameters. The proposed convolutional block has the same number of parameters as a…
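The shift, subtract, and add idea can be illustrated with a minimal NumPy sketch. This is not the authors' exact block: the (T, C, H, W) layout, the zero-filled shift at the sequence boundaries, the one-frame shift distance, and the function names are all assumptions made for illustration. The input `feats` stands in for per-frame feature maps that a 2D convolution is assumed to have already produced.

```python
import numpy as np


def shift_time(x: np.ndarray, step: int) -> np.ndarray:
    """Shift x along the time axis (axis 0) by `step` frames, zero-filling the gap."""
    out = np.zeros_like(x)
    if step > 0:
        out[step:] = x[:-step]   # out[t] = x[t - step]
    elif step < 0:
        out[:step] = x[-step:]   # out[t] = x[t + |step|]
    else:
        out[...] = x
    return out


def temporal_difference_mix(feats: np.ndarray) -> np.ndarray:
    """Add parameter-free temporal differences to per-frame features.

    feats: array of shape (T, C, H, W), assumed to come from a 2D convolution
    applied to each frame independently.
    """
    diff_prev = feats - shift_time(feats, 1)    # change relative to the previous frame
    diff_next = feats - shift_time(feats, -1)   # change relative to the next frame
    return feats + diff_prev + diff_next        # add the differences back to the features


# Hypothetical input: 8 frames, 4 channels, 16x16 feature maps.
feats = np.random.default_rng(0).standard_normal((8, 4, 16, 16)).astype(np.float32)
out = temporal_difference_mix(feats)
print(out.shape)
```

None of these operations has weights, so all trainable parameters stay in the 2D convolution. For a static sequence the interior frames pass through unchanged, since both differences are zero there; only the zero-filled boundary frames differ.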

Taxonomy

Methods: 3D Convolution · Convolution