# DynamoNet: Dynamic Action and Motion Network

**Authors:** Ali Diba, Vivek Sharma, Luc Van Gool, Rainer Stiefelhagen

arXiv: 1904.11407 · 2019-04-26

## TL;DR

DynamoNet introduces a novel 3D-CNN architecture with dynamic motion filters that adaptively learn video-specific motion representations through future frame prediction, enhancing human action recognition.

## Contribution

The paper proposes a new dynamic motion representation embedded in a 3D-CNN, jointly trained with classification and future frame prediction tasks for improved action recognition.
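The joint training described above can be sketched as a single objective that sums a classification loss and a weighted future-frame prediction loss. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean-squared-error prediction loss, the `prediction_weight` hyperparameter, and all function names are hypothetical, since this summary does not give the paper's exact loss formulation.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample, given raw class scores and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(predicted, target):
    """Mean squared error between two equal-length flat lists of pixel values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def joint_loss(logits, label, predicted_frame, future_frame, prediction_weight=0.5):
    """Classification loss plus a weighted auxiliary frame-prediction loss.

    prediction_weight is an assumed hyperparameter, not a value from the paper.
    """
    classification = softmax_cross_entropy(logits, label)
    prediction = mean_squared_error(predicted_frame, future_frame)
    return classification + prediction_weight * prediction
```

In a multi-task setup of this kind, both terms are backpropagated through the shared layers, so the weight on the auxiliary term controls how strongly frame prediction shapes the features used for classification.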

## Key findings

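- The authors report promising results on three human action datasets: Kinetics 400, UCF101, and HMDB51.
- Dynamic filters learn video-specific motion features by predicting short-term future frames.
- Future frame prediction acts as an auxiliary task that strengthens action classification.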

## Abstract

In this paper, we are interested in self-supervised learning of motion cues in videos using dynamic motion filters, with the goal of a better motion representation that ultimately boosts human action recognition. Thus far, the vision community has focused on spatio-temporal approaches that use standard filters; here we instead propose dynamic filters that adaptively learn the video-specific internal motion representation by predicting short-term future frames. We name this new motion representation the dynamic motion representation (DMR). It is embedded inside a 3D convolutional network as a new layer, which captures the visual appearance and motion dynamics throughout the entire video clip via end-to-end network learning. Simultaneously, we use this motion representation to enrich video classification. We have designed the frame prediction task as an auxiliary task to empower the classification problem. To this end, we introduce a novel unified spatio-temporal 3D-CNN architecture (DynamoNet) that jointly optimizes video classification and motion representation learning, via future frame prediction, as a multi-task learning problem. We conduct experiments on challenging human action datasets: Kinetics 400, UCF101, and HMDB51. The experiments using the proposed DynamoNet show promising results on all the datasets.
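To make the contrast between standard and dynamic filters concrete, the sketch below applies a separate kernel at every pixel of a frame, in the general style of dynamic filter networks. It is an illustration only: the abstract does not specify how DynamoNet generates or applies its filters, so the per-pixel local filtering, the zero padding, and the name `apply_dynamic_filters` are assumptions. In a trained network the kernels would be produced by a sub-network from the input clip; here they are simply passed in.

```python
def apply_dynamic_filters(frame, filters):
    """Filter each pixel of `frame` with its own k x k kernel (zero padding).

    frame:   H x W nested list of floats.
    filters: H x W nested list of k x k kernels, one per output pixel.
             (Hypothetical layout; a real model would predict these per video.)
    """
    height, width = len(frame), len(frame[0])
    k = len(filters[0][0])
    radius = k // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            kernel = filters[y][x]
            total = 0.0
            for dy in range(k):
                for dx in range(k):
                    sy, sx = y + dy - radius, x + dx - radius
                    # Pixels outside the frame contribute zero.
                    if 0 <= sy < height and 0 <= sx < width:
                        total += kernel[dy][dx] * frame[sy][sx]
            output[y][x] = total
    return output


# Usage: a kernel that copies the left neighbour moves content one pixel right,
# a toy stand-in for predicting the next frame of rightward motion.
shift_right = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
frame = [[1.0, 2.0, 3.0]]
predicted = apply_dynamic_filters(frame, [[shift_right] * 3])
```

A standard convolution would share one kernel across all positions and all videos; letting the kernels vary with the input is what allows the filtering to adapt to the motion in a specific clip.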

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.11407/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1904.11407/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1904.11407/full.md

---
Source: https://tomesphere.com/paper/1904.11407