VPN: Learning Video-Pose Embedding for Activities of Daily Living

Srijan Das; Saurav Sharma; Rui Dai; Francois Bremond; Monique Thonnat

arXiv:2007.03056·cs.CV·July 8, 2020

TL;DR

This paper introduces VPN, a novel video-pose embedding network that combines spatial pose and RGB cues with attention mechanisms to improve recognition of subtle and similar daily activities in videos.

Contribution

The paper proposes VPN, a new model that integrates a spatial embedding and an attention network to better capture fine-grained spatio-temporal patterns for activity recognition.

Findings

01 VPN outperforms the state of the art on the NTU-RGB+D 120 and NTU-RGB+D 60 datasets.

02 VPN achieves superior results on the Toyota Smarthome dataset.

03 VPN is also effective on a small-scale human-object interaction dataset.

Abstract

In this paper, we focus on the spatio-temporal aspect of recognizing Activities of Daily Living (ADL). ADL have two specific properties: (i) subtle spatio-temporal patterns and (ii) similar visual patterns varying with time. Therefore, ADL may look very similar, and distinguishing them often requires looking at their fine-grained details. Because recent spatio-temporal 3D ConvNets are too rigid to capture the subtle visual patterns across an action, we propose a novel Video-Pose Network: VPN. The two key components of VPN are a spatial embedding and an attention network. The spatial embedding projects the 3D poses and RGB cues into a common semantic space. This enables the action recognition framework to learn better spatio-temporal features by exploiting both modalities. In order to discriminate similar actions, the attention network provides two functionalities - (i) an end-to-end…
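
As a rough illustration of what "projecting 3D poses and RGB cues into a common space" can mean in practice, the Python sketch below embeds one frame's pose vector and each spatial cell of an RGB feature map with two linear maps, scores the cells by similarity to the pose embedding, and pools the RGB features with the resulting softmax weights. This is not the VPN architecture: the dot-product scoring, the random weights, the dimensions, and every name (`project`, `pose_guided_pooling`, `w_rgb`, `w_pose`) are assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(vec, weights):
    """Linear projection; `weights` holds one row per output dimension."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def pose_guided_pooling(rgb_cells, pose_vec, w_rgb, w_pose):
    """Pool RGB features with attention weights derived from the pose.

    rgb_cells: per-location RGB feature vectors for one frame (hypothetical).
    pose_vec:  flattened 3D joint coordinates for the same frame (hypothetical).
    """
    # Embed both modalities into the same E-dimensional space.
    pose_emb = project(pose_vec, w_pose)
    rgb_embs = [project(cell, w_rgb) for cell in rgb_cells]

    # Scaled dot-product similarity between each location and the pose.
    scale = math.sqrt(len(pose_emb))
    scores = [sum(a * b for a, b in zip(emb, pose_emb)) / scale
              for emb in rgb_embs]
    attn = softmax(scores)

    # Attention-weighted sum of the original RGB features.
    dim = len(rgb_cells[0])
    pooled = [sum(a * cell[i] for a, cell in zip(attn, rgb_cells))
              for i in range(dim)]
    return pooled, attn


random.seed(0)
C, D, E, N = 8, 6, 4, 9  # RGB feature dim, pose dim, embedding dim, 3x3 grid
rgb_cells = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
pose_vec = [random.gauss(0, 1) for _ in range(D)]
w_rgb = [[random.gauss(0, 0.5) for _ in range(C)] for _ in range(E)]
w_pose = [[random.gauss(0, 0.5) for _ in range(D)] for _ in range(E)]

pooled, attn = pose_guided_pooling(rgb_cells, pose_vec, w_rgb, w_pose)
print(len(pooled), round(sum(attn), 6))
```

In a trained model the two projections would be learned jointly with the classifier, so that locations relevant to the pose receive larger weights; here the weights are random and only the data flow is meaningful.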

Code & Models

Repositories

srijandas07/VPN (TensorFlow, official)
