Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets
Moustafa Meshry, Mohamed E. Hussein, Marwan Torki

TL;DR
This paper introduces a fast, linear-time online action detection method using 3D skeletal data that is suitable for real-time applications and invariant to temporal scale changes.
Contribution
It presents a novel online detection approach that efficiently identifies action intervals in 3D skeletal data with linear complexity, enabling real-time performance.
Findings
Achieves linear-time detection suitable for real-time applications
Invariant to temporal scale variations
Effective in identifying action intervals in unsegmented streams
Abstract
Sliding window is one direct way to extend a successful recognition system to handle the more challenging detection problem. While action recognition decides only whether or not an action is present in a pre-segmented video sequence, action detection identifies the time interval where the action occurred in an unsegmented video stream. Sliding window approaches for action detection can however be slow as they maximize a classifier score over all possible sub-intervals. Even though new schemes utilize dynamic programming to speed up the search for the optimal sub-interval, they require offline processing on the whole video sequence. In this paper, we propose a novel approach for online action detection based on 3D skeleton sequences extracted from depth data. It identifies the sub-interval with the maximum classifier score in linear time. Furthermore, it is invariant to temporal scale…
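The contrast the abstract draws between an exhaustive sub-interval search and a linear-time online search can be illustrated with a small sketch. It assumes, hypothetically, that the classifier score of an interval is the sum of per-frame scores; under that assumption a maximum-sum interval can be tracked in a single pass with a Kadane-style running sum. The function names and the per-frame score input are illustrative and are not taken from the paper.

```python
from typing import Iterable, Optional, Sequence, Tuple

Interval = Tuple[int, int, float]  # (start, end, score), indices inclusive


def brute_force_interval(scores: Sequence[float]) -> Optional[Interval]:
    """Exhaustive search over all sub-intervals: O(n^2) time, needs the whole sequence."""
    best: Optional[Interval] = None
    for i in range(len(scores)):
        total = 0.0
        for j in range(i, len(scores)):
            total += scores[j]
            if best is None or total > best[2]:
                best = (i, j, total)
    return best


def online_max_interval(frame_scores: Iterable[float]) -> Optional[Interval]:
    """Single pass over a stream of per-frame scores: O(n) time, O(1) memory.

    Assumption: an interval's score is the sum of its per-frame scores.
    """
    best: Optional[Interval] = None
    run_start = 0    # start index of the current running interval
    run_score = 0.0  # summed score of the current running interval
    for t, s in enumerate(frame_scores):
        if run_score <= 0.0:
            # A non-positive prefix cannot improve any later interval; restart here.
            run_start, run_score = t, s
        else:
            run_score += s
        if best is None or run_score > best[2]:
            best = (run_start, t, run_score)
    return best
```

Because the single-pass version only reads each frame score once and keeps a constant amount of state, it can consume frames as they arrive, whereas the exhaustive search needs the full sequence before it can start.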
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
