From Pose to Activity: Surveying Datasets and Introducing CONVERSE
Michael Edwards, Jingjing Deng, Xianghua Xie

TL;DR
This paper reviews existing human action recognition datasets, highlights gaps in interaction modeling, and introduces a new 3D pose-based dataset capturing complex conversational interactions to advance research.
Contribution
It provides a comprehensive survey of current datasets and introduces CONVERSE, a novel dataset focusing on complex conversational interactions using 3D pose data.
Findings
Existing datasets lack complex interaction scenarios.
The new dataset captures subtle, real-world conversational actions.
Recognition of conversational interactions from 3D pose is feasible.
Abstract
We present a review of the current state of publicly available datasets within the human action recognition community, highlighting the revival of pose-based methods and recent progress in understanding person-person interaction modeling. We categorize datasets by several key properties relevant to their use as benchmarks, including the number of class labels, the ground truths provided, and the application domain they occupy. We also consider the level of abstraction of each dataset, grouping those that present actions, interactions, and higher-level semantic activities. The survey identifies key appearance- and pose-based datasets, noting a tendency for simplistic, emphasized, or scripted action classes that are often readily definable by a stable collection of sub-action gestures. There is a clear lack of datasets that provide closely related actions, those that are not implicitly…
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
