Two Stream Self-Supervised Learning for Action Recognition

Ahmed Taha; Moustafa Meshry; Xitong Yang; Yi-Ting Chen; Larry Davis

arXiv:1806.07383·cs.CV·June 20, 2018·6 cites

Open Access

TL;DR

This paper introduces a two-stream self-supervised learning method that leverages spatio-temporal signals for improved action recognition in videos, validated across multiple datasets.

Contribution

It proposes a novel two-stream architecture with sequence verification and spatio-temporal alignment tasks for self-supervised learning in video action recognition.

Findings

01 Effective on HMDB51, UCF101, and HDD datasets

02 Outperforms some baseline methods in self-supervised learning

03 Shows potential for generalization with further improvements

Abstract

We present a self-supervised approach that uses spatio-temporal signals between video frames for action recognition. A two-stream architecture is leveraged to entangle spatial and temporal representation learning. Our task is formulated as both a sequence verification task and a spatio-temporal alignment task. The former requires understanding the temporal structure of motion, while the latter couples the learned motion with the spatial representation. The effectiveness of the self-supervised pre-trained weights is validated on the action recognition task. Quantitative evaluation shows the competence of the self-supervised approach on three datasets: HMDB51, UCF101, and the Honda driving dataset (HDD). Further investigation is still required to boost performance and to validate the approach more generally.
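The two pretext tasks named in the abstract can be illustrated by how their training labels might be sampled. The sketch below is only an assumption about one plausible setup, not the authors' procedure: the function names, the tuple length, the 50/50 positive/negative split, and the `min_gap` parameter are all hypothetical. Sequence verification is modeled as deciding whether a tuple of frame indices is in temporal order, and spatio-temporal alignment as deciding whether an RGB frame and a motion input come from the same time step.

```python
import random


def sample_verification_tuple(num_frames, tuple_len=3, rng=random):
    """Hypothetical sampler for the sequence verification task.

    Returns (indices, label), where label is 1 if the frame indices are
    in temporal order and 0 if they have been shuffled out of order.
    """
    indices = sorted(rng.sample(range(num_frames), tuple_len))
    if rng.random() < 0.5:
        return indices, 1
    shuffled = indices[:]
    # Reshuffle until the order actually differs from the sorted one.
    while shuffled == indices:
        rng.shuffle(shuffled)
    return shuffled, 0


def sample_alignment_pair(num_frames, min_gap=5, rng=random):
    """Hypothetical sampler for the spatio-temporal alignment task.

    Returns (rgb_index, motion_index, label), where label is 1 if the
    motion input is taken at the same time step as the RGB frame and 0
    if it is taken at least `min_gap` frames away.
    """
    if num_frames < 2 * min_gap:
        raise ValueError("video too short for the requested min_gap")
    rgb_index = rng.randrange(num_frames)
    if rng.random() < 0.5:
        return rgb_index, rgb_index, 1
    candidates = [t for t in range(num_frames) if abs(t - rgb_index) >= min_gap]
    return rgb_index, rng.choice(candidates), 0


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_verification_tuple(30, rng=rng))
    print(sample_alignment_pair(30, rng=rng))
```

In a two-stream setting, the indices from the first sampler would select inputs for the motion stream, while the pair from the second sampler would feed one frame to the spatial stream and one motion input to the temporal stream. How the paper actually constructs positives and negatives is not stated in the abstract.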

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis