Simultaneous Joint and Object Trajectory Templates for Human Activity Recognition from 3-D Data
Saeed Ghodsi, Hoda Mohammadzade, Erfan Korki

TL;DR
This paper introduces a novel approach for human activity recognition from 3-D joint and object trajectories, using dynamic time warping and wavelet features to handle variations and interactions, achieving superior results on challenging datasets.
Contribution
It presents a new method for generating joint and object activity templates and warping samples to these templates, effectively modeling human-object interactions in 3-D data.
Findings
Outperforms state-of-the-art methods on multiple datasets
Effectively models human-object interactions in activity recognition
Handles variations in speed and style of actions
Abstract
The availability of low-cost range sensors and the development of relatively robust algorithms for the extraction of skeleton joint locations have inspired many researchers to develop human activity recognition methods using the 3-D data. In this paper, an effective method for the recognition of human activities from the normalized joint trajectories is proposed. We represent the actions as multidimensional signals and introduce a novel method for generating action templates by averaging the samples in a "dynamic time" sense. Then in order to deal with the variations in the speed and style of performing actions, we warp the samples to the action templates by an efficient algorithm and employ wavelet filters to extract meaningful spatiotemporal features. The proposed method is also capable of modeling the human-object interactions, by performing the template generation and temporal…
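As a rough illustration of the warping step described in the abstract, the following is a minimal sketch of textbook dynamic time warping (DTW) applied to multidimensional signals. It is not the authors' efficient algorithm, and it does not cover template generation or wavelet feature extraction. The function names `dtw_path` and `warp_to_template`, the Euclidean frame distance, and the choice to average the sample frames aligned to each template frame are all assumptions made for this example.

```python
import numpy as np


def dtw_path(sample, template):
    """Textbook DTW between two (T, D) arrays.

    Returns the accumulated alignment cost and the warping path as a list
    of (sample_index, template_index) pairs. Hypothetical helper, not the
    paper's algorithm.
    """
    n, m = len(sample), len(template)
    # Pairwise Euclidean distance between every sample and template frame.
    dist = np.linalg.norm(sample[:, None, :] - template[None, :, :], axis=2)

    # acc[i, j] = minimal cost of aligning the first i sample frames
    # with the first j template frames.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the end of both sequences to recover the path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin((acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return acc[n, m], path


def warp_to_template(sample, template):
    """Resample `sample` onto the template's time axis.

    Each output frame is the mean of the sample frames that DTW aligns
    to the corresponding template frame (an assumed design choice).
    """
    _, path = dtw_path(sample, template)
    warped = np.zeros(template.shape, dtype=float)
    counts = np.zeros(len(template))
    for i, j in path:
        warped[j] += sample[i]
        counts[j] += 1
    return warped / counts[:, None]
```

After warping, every sample has the same length as its template, so speed differences between performers no longer change the signal length and fixed-length features can be computed per trajectory dimension.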
