Featureless: Bypassing feature extraction in action categorization
Silvia L. Pintea, Pascal S. Mettes, Jan C. van Gemert, and Arnold W. M. Smeulders

TL;DR
This paper presents a novel action categorization method that bypasses traditional feature extraction by directly learning to predict video representations from raw data, leveraging a discriminative Waldboost model for efficient classification.
Contribution
It introduces a featureless approach for action recognition that directly predicts video representations from raw data, improving efficiency and flexibility over traditional feature-based methods.
Findings
Achieves competitive accuracy on the UCF11 dataset
Demonstrates computational efficiency comparable to feature-based methods
Supports both 2D and 3D video representation prediction
Abstract
This method introduces an efficient manner of learning action categories without the need of feature estimation. The approach starts from low-level values, in a similar style to the successful CNN methods. However, rather than extracting general image features, we learn to predict specific video representations from raw video data. The benefit of such an approach is that at the same computational expense it can predict 2D video representations as well as 3D ones, based on motion. The proposed model relies on discriminative Waldboost, which we enhance to a multiclass formulation for the purpose of learning video representations. The suitability of the proposed approach as well as its time efficiency are tested on the UCF11 action recognition dataset.
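For readers unfamiliar with Waldboost, the following is a minimal sketch of the general idea behind a Waldboost-style sequential decision: weak-classifier responses are accumulated one stage at a time, and evaluation stops as soon as the running score crosses a per-stage accept or reject threshold. This is an illustration of the generic binary technique only, not the paper's multiclass formulation. The function name `sequential_decision`, the threshold lists `theta_a` and `theta_b`, and the fallback rule of thresholding the final score at zero are assumptions made for this example.

```python
from typing import Sequence, Tuple


def sequential_decision(
    responses: Sequence[float],
    theta_a: Sequence[float],
    theta_b: Sequence[float],
) -> Tuple[int, int]:
    """Illustrative Waldboost-style early-stopping decision (binary case).

    responses: real-valued outputs of the weak classifiers, in stage order.
    theta_a:   per-stage reject thresholds (hypothetical values).
    theta_b:   per-stage accept thresholds (hypothetical values).

    Returns (label, stages_used), where label is +1 or -1.
    """
    score = 0.0
    for stage, (h, a, b) in enumerate(zip(responses, theta_a, theta_b), start=1):
        score += h  # accumulate the strong-classifier score
        if score >= b:
            return 1, stage  # confident accept: stop early
        if score <= a:
            return -1, stage  # confident reject: stop early
    # No threshold crossed: fall back to the sign of the final score
    # (an assumption for this sketch).
    return (1 if score > 0 else -1), len(responses)


if __name__ == "__main__":
    label, used = sequential_decision([0.5, 0.6, 0.1], [-1.0] * 3, [1.0] * 3)
    print(label, used)
```

The early exit is what makes this family of classifiers time-efficient: easy samples are decided after only a few weak classifiers, and only ambiguous ones are evaluated through every stage.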
