Beyond Spatial Pyramid Matching: Space-time Extended Descriptor for Action Recognition
Zhenzhong Lan, Alexander G. Hauptmann

TL;DR
This paper proposes a space-time extended descriptor for action recognition in videos, integrating spatio-temporal location into feature encoding to improve efficiency and accuracy over traditional spatio-temporal pyramids.
Contribution
It introduces a simple, efficient spatio-temporal location encoding method that outperforms or matches existing pyramid-based approaches in action recognition tasks.
Findings
Achieves comparable or better results than spatio-temporal pyramid methods.
Yields lower-dimensional video representations than spatio-temporal pyramids, which reduces the risk of overfitting.
Demonstrates effectiveness across multiple benchmark datasets.
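The dimensionality claim can be made concrete with a rough back-of-the-envelope sketch. Everything specific here is an assumption for illustration and is not taken from the summary above: the use of Fisher vector encoding (dimension 2·K·D for K Gaussian components and D-dimensional descriptors), the values K = 256 and D = 64, and the two-level pyramid layout. The function names are hypothetical.

```python
def fisher_vector_dim(n_components, descriptor_dim):
    # Mean and variance gradients per Gaussian component: 2 * K * D.
    return 2 * n_components * descriptor_dim


def pyramid_dim(n_components, descriptor_dim, grids):
    # A pyramid encodes every cell separately and concatenates the results,
    # so the dimension is multiplied by the total number of cells.
    n_cells = sum(gx * gy * gt for gx, gy, gt in grids)
    return n_cells * fisher_vector_dim(n_components, descriptor_dim)


def extended_dim(n_components, descriptor_dim, n_location_dims=3):
    # Appending (x, y, t) to each descriptor only grows D by 3
    # before a single encoding pass over the whole video.
    return fisher_vector_dim(n_components, descriptor_dim + n_location_dims)


# Assumed values for illustration only.
K, D = 256, 64
grids = [(1, 1, 1), (2, 2, 2)]  # whole video plus a 2x2x2 split: 9 cells

print(pyramid_dim(K, D, grids))  # 294912
print(extended_dim(K, D))        # 34304
```

Under these assumed settings the pyramid representation is roughly nine times larger than the single encoding, while the location-extended one grows by under 5 percent.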
Abstract
We address the problem of generating video features for action recognition. The spatial pyramid and its variants have been very popular feature models due to their success in balancing spatial location encoding and spatial invariance. Although it seems straightforward to extend the spatial pyramid to the temporal domain (spatio-temporal pyramid), the large spatio-temporal diversity of unconstrained videos and the resulting significantly higher-dimensional representations make it less appealing. This paper introduces the space-time extended descriptor, a simple but efficient alternative way to include the spatio-temporal location into the video features. Instead of only coding motion information and leaving the spatio-temporal location to be represented at the pooling stage, location information is used as part of the encoding step. This method is a much more effective and efficient location…
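The idea of using location as part of the encoding step can be sketched as appending each local feature's position to its descriptor before any codebook or encoding is applied. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `extend_descriptors`, the (x, y, t) ordering, and the normalization by frame width, frame height, and frame count are all hypothetical choices.

```python
import numpy as np


def extend_descriptors(descriptors, positions, video_shape):
    """Append normalized (x, y, t) to each local descriptor.

    descriptors: (N, D) array of local motion/appearance descriptors.
    positions:   (N, 3) array of (x, y, t) locations in pixels and frames.
    video_shape: (width, height, n_frames) used for normalization
                 (an assumed scheme; other normalizations are possible).
    """
    descriptors = np.asarray(descriptors, dtype=np.float64)
    positions = np.asarray(positions, dtype=np.float64)
    scale = np.asarray(video_shape, dtype=np.float64)
    normalized = positions / scale  # broadcast over the N rows
    # The result has D + 3 columns and can be fed to any encoder as usual.
    return np.hstack([descriptors, normalized])


# Usage with made-up data: two 4-dimensional descriptors.
desc = np.ones((2, 4))
pos = np.array([[160.0, 120.0, 50.0], [0.0, 0.0, 0.0]])
extended = extend_descriptors(desc, pos, video_shape=(320, 240, 100))
print(extended.shape)  # (2, 7)
```

Because the location travels with the descriptor, a single encoding of the whole video carries spatio-temporal information, and no per-cell pooling is needed.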
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Gait Recognition and Analysis
