First-Take-All: Temporal Order-Preserving Hashing for 3D Action Videos
Jun Ye, Hao Hu, Kai Li, Guo-Jun Qi, Kien A. Hua

TL;DR
This paper introduces First-Take-All (FTA) hashing for 3D action video recognition: a scalable, efficient approach that is robust to temporal and motion-scale variations and is reported to outperform existing methods in accuracy.
Contribution
The paper presents a novel FTA hashing algorithm that encodes entire 3D action videos into fixed-length hash codes, improving recognition robustness and efficiency.
Findings
Achieves over 80% recognition accuracy on public datasets.
Produces compact hash codes with about 15 bits per frame.
Demonstrates robustness to temporal variations and scale changes.
Abstract
With the prevalence of commodity depth cameras, the new paradigm of user interfaces based on 3D motion capturing and recognition has dramatically changed the way humans and computers interact. Human action recognition, as one of the key components in these devices, plays an important role in guaranteeing the quality of user experience. Although model-driven methods have achieved huge success, they cannot provide a scalable solution for efficiently storing, retrieving and recognizing actions in large-scale applications. These models are also vulnerable to temporal translation and warping, as well as to variations in motion scales and execution rates. To address these challenges, we treat 3D human action recognition as a video-level hashing problem and propose a novel First-Take-All (FTA) Hashing algorithm capable of hashing the entire video…
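The abstract is cut off before it explains how a code is produced, so the following is only a minimal sketch of one way a temporal order-preserving video hash could work. It is not the authors' algorithm. It assumes that each frame is a fixed-length feature vector, that frames are projected onto random directions, and that each code symbol records which direction in a small group reaches its peak response first in time. The function name `fta_hash_sketch` and the parameters `num_codes`, `group_size`, and `seed` are hypothetical.

```python
import numpy as np


def fta_hash_sketch(frames, num_codes=8, group_size=4, seed=0):
    """Illustrative order-based video hash (assumed formulation, not the paper's).

    frames: array of shape (T, D), one D-dimensional feature vector per frame.
    Returns an integer array of length num_codes with values in [0, group_size).
    """
    frames = np.asarray(frames, dtype=float)
    rng = np.random.default_rng(seed)
    dim = frames.shape[1]

    # Hypothetical choice: one group of random projection directions per symbol.
    projections = rng.standard_normal((num_codes, group_size, dim))

    # responses[c, k, t] is the response of direction k in group c at frame t.
    responses = np.einsum("ckd,td->ckt", projections, frames)

    # Frame index at which each direction attains its maximum response.
    peak_times = responses.argmax(axis=2)

    # The symbol is the index of the direction whose peak occurs first.
    return peak_times.argmin(axis=1)


# Usage: a random 30-frame "video" with 20-dimensional per-frame features.
video = np.random.default_rng(1).standard_normal((30, 20))
codes = fta_hash_sketch(video)
```

Because each symbol depends only on the relative order of peak times, this sketch yields a fixed-length code regardless of video length, and the code is unchanged when all features are multiplied by a positive constant or when every frame is repeated the same number of times. That is the kind of robustness to motion scale and execution rate the abstract asks for, though the paper's actual construction may differ.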
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
