Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video

Lionel Pigou; Aäron van den Oord; Sander Dieleman; Mieke Van Herreweghe; Joni Dambre

arXiv:1506.01911·cs.CV·February 11, 2016

TL;DR

The paper proposes an end-to-end trainable neural network that combines temporal convolutions with bidirectional recurrence, and reports that it improves gesture recognition in video over simple temporal feature pooling.

Contribution

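The paper shows that simple temporal feature pooling is not sufficient for gesture recognition in video and that recurrence is crucial for the task, and it proposes an end-to-end trainable architecture incorporating temporal convolutions and bidirectional recurrence.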

Findings

1. Recurrence is crucial for capturing temporal dynamics in gesture recognition.
2. Adding temporal convolutions significantly improves performance.
3. Achieved state-of-the-art results on the Montalbano dataset.

Abstract

Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition. For the task of capturing temporal structure in video, however, there still remain numerous open research questions. Current research suggests using a simple temporal feature pooling strategy to take into account the temporal aspect of video. We demonstrate that this method is not sufficient for gesture recognition, where temporal information is more discriminative compared to general video classification tasks. We explore deep architectures for gesture recognition in video and propose a new end-to-end trainable neural network architecture incorporating temporal convolutions and bidirectional recurrence. Our main contributions are twofold; first, we show that recurrence is crucial for this task; second, we show that adding temporal convolutions leads…
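The contrast the abstract draws between temporal feature pooling and a temporal-convolution-plus-bidirectional-recurrence model can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, layer sizes, random weights, and the plain tanh recurrence are all assumptions chosen for brevity. It only shows that mean pooling over time discards frame order, while a temporal convolution followed by a bidirectional recurrent layer does not.

```python
import numpy as np

def temporal_mean_pool(frames):
    """Average per-frame features over time: (T, D) -> (D,)."""
    return frames.mean(axis=0)

def temporal_conv(frames, kernel):
    """Zero-padded 1D convolution over the time axis, followed by ReLU.
    frames: (T, D_in); kernel: (K, D_in, D_out) with odd K -> (T, D_out)."""
    K = kernel.shape[0]
    pad = K // 2
    padded = np.pad(frames, ((pad, pad), (0, 0)))
    out = np.stack([
        np.einsum("kd,kde->e", padded[t:t + K], kernel)
        for t in range(frames.shape[0])
    ])
    return np.maximum(out, 0.0)

def simple_rnn(frames, W_in, W_rec):
    """Plain tanh recurrence (a stand-in for any recurrent cell): (T, D_in) -> (T, H)."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in frames:
        h = np.tanh(x @ W_in + h @ W_rec)
        states.append(h)
    return np.stack(states)

def bidirectional_rnn(frames, fwd, bwd):
    """Run one recurrence forward in time and one backward, then concatenate per frame."""
    forward = simple_rnn(frames, *fwd)
    backward = simple_rnn(frames[::-1], *bwd)[::-1]
    return np.concatenate([forward, backward], axis=1)

# Hypothetical sizes: T frames, D features per frame, C conv channels, H hidden units.
rng = np.random.default_rng(0)
T, D, C, H = 8, 4, 5, 3
frames = rng.normal(size=(T, D))
kernel = rng.normal(size=(3, D, C))
fwd = (rng.normal(size=(C, H)), rng.normal(size=(H, H)))
bwd = (rng.normal(size=(C, H)), rng.normal(size=(H, H)))

def conv_birnn(x):
    """Temporal convolution followed by a bidirectional recurrent layer: (T, D) -> (T, 2H)."""
    return bidirectional_rnn(temporal_conv(x, kernel), fwd, bwd)

# Play the same clip backwards and compare both representations.
reversed_frames = frames[::-1]
pool_same = np.allclose(temporal_mean_pool(frames), temporal_mean_pool(reversed_frames))
seq_same = np.allclose(conv_birnn(frames), conv_birnn(reversed_frames))
print("pooling unchanged by reversal:", pool_same)
print("conv + bidirectional RNN unchanged by reversal:", seq_same)
```

Reversing the clip is a crude proxy for gestures that differ mainly in motion order. A pooled representation cannot separate such pairs, which is one way to read the abstract's claim that temporal information is more discriminative for gestures than for general video classification.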

Code & Models

Repositories

chriswegmann/drone_steering