A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets
Ivan Lillo, Juan Carlos Niebles, Alvaro Soto

TL;DR
This paper presents a hierarchical, pose-based model for complex human action recognition that automatically discovers active body parts, jointly learns motion and action representations, and improves robustness to pose estimation errors.
Contribution
It introduces a novel hierarchical model that learns without spatial supervision and effectively recognizes complex actions using body joint data.
Findings
Outperforms existing action recognition methods on multiple benchmarks.
Automatically discovers active body parts from temporal annotations.
Enhances robustness by discarding non-informative body parts.
Abstract
In this paper, we introduce a new hierarchical model for human action recognition using body joint locations. Our model can categorize complex actions in videos, and perform spatio-temporal annotations of the atomic actions that compose the complex action being performed. That is, for each atomic action, the model generates temporal action annotations by estimating its starting and ending times, as well as spatial annotations by inferring the human body parts that are involved in executing the action. Our model includes three key novel properties: (i) it can be trained with no spatial supervision, as it can automatically discover active body parts from temporal action annotations only; (ii) it jointly learns flexible representations for motion poselets and actionlets that encode the visual variability of body parts and atomic actions; (iii) a mechanism to discard idle or non-informative…
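To make the notion of a temporal annotation concrete, the following minimal sketch shows one way per-frame atomic-action labels could be collapsed into start and end times. It is an illustration only and not the paper's inference procedure: the function name `frames_to_segments`, the `"idle"` label, and the example label sequence are all hypothetical, and the sketch assumes per-frame labels for a body region are already available.

```python
from itertools import groupby


def frames_to_segments(frame_labels, idle_label=None):
    """Collapse per-frame labels into (label, start_frame, end_frame) runs.

    Hypothetical helper: frames carrying `idle_label` are skipped, loosely
    mirroring the idea of discarding idle body parts.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        if label != idle_label:
            # end_frame is inclusive
            segments.append((label, start, start + length - 1))
        start += length
    return segments


# Hypothetical per-frame labels for one body region (e.g. the right arm).
right_arm = ["idle", "idle", "wave", "wave", "wave", "idle", "drink", "drink"]
segments = frames_to_segments(right_arm, idle_label="idle")
```

Each resulting tuple gives an atomic action together with its first and last frame, which is the form of temporal annotation the abstract describes; the spatial annotation would correspond to which body region the label sequence belongs to.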
