Deep Convolutional Poses for Human Interaction Recognition in Monocular Videos

Marcel Sheeny de Moraes; Sankha Mukherjee; Neil M Robertson

arXiv:1612.03982 · cs.CV · December 14, 2016

TL;DR

This paper shows that deep pose estimation from monocular RGB videos can be used to recognize human interactions, reaching 87.56% average accuracy on the Two-Person interaction dataset against 91.10% for a depth-sensor method, which suggests that interaction recognition is feasible with standard cameras.

Contribution

It introduces a five-step method (person detection, tracking, pose estimation, pose-based feature extraction, and classification) for interaction recognition in monocular videos, showing that RGB cameras can approach depth sensor performance.

Findings

01

Achieved 87.56% average accuracy on the Two-Person interaction dataset.

02

RGB-based method performs comparably to depth sensor approaches.

03

Deep models enable effective human interaction recognition from monocular videos.

Abstract

Human interaction recognition is a challenging problem in computer vision and has been researched over the years due to its important applications. With the development of deep models for human pose estimation, this work aims to verify the effectiveness of using human pose to recognize human interactions in monocular videos. The paper develops a five-step method: detect each person in the scene, track them, retrieve the human pose, extract features based on the pose, and finally recognize the interaction using a classifier. The Two-Person interaction dataset was used to develop this methodology. Using a whole-sequence evaluation approach, it achieved an average accuracy of 87.56% over all interactions. Yun et al. achieved 91.10% on the same dataset; however, their methodology used a depth sensor to recognize the interaction. The methodology…
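The last two steps of the pipeline (pose-based features and whole-sequence evaluation) can be illustrated with a minimal sketch. The abstract does not say which features, classifier, or aggregation rule the authors used, so everything here is an assumption: the features are pairwise joint distances between the two people, the sequence label is a majority vote over per-frame predictions, and "average accuracy" is taken as the mean of per-class accuracies. The function names are hypothetical.

```python
import numpy as np


def pairwise_joint_distances(pose_a, pose_b):
    """Assumed feature: distance from every joint of A to every joint of B.

    pose_a, pose_b: arrays of shape (num_joints, 2) holding (x, y) coordinates.
    Returns a flat vector of length num_joints * num_joints.
    """
    diff = pose_a[:, None, :] - pose_b[None, :, :]
    return np.linalg.norm(diff, axis=-1).ravel()


def sequence_label(frame_predictions):
    """Assumed whole-sequence rule: majority vote over per-frame class ids."""
    counts = np.bincount(np.asarray(frame_predictions))
    return int(np.argmax(counts))


def average_accuracy(y_true, y_pred):
    """Assumed metric: mean of the per-class accuracies."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    per_class = [(y_pred[y_true == c] == c).mean() for c in classes]
    return float(np.mean(per_class))
```

In a full pipeline, the feature vector of each frame would be passed to a per-frame classifier, and `sequence_label` would then collapse those predictions into one label per video.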

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Video Surveillance and Tracking Methods