Scale Coding Bag of Deep Features for Human Attribute and Action Recognition
Fahad Shahbaz Khan, Joost van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

TL;DR
This paper introduces scale coding techniques within a Bag of Deep Features framework to explicitly encode multi-scale information, improving human attribute and action recognition accuracy across multiple datasets.
Contribution
It proposes two novel scale coding strategies that explicitly incorporate multi-scale features into deep image representations, outperforming scale-invariant methods.
Findings
Outperforms scale-invariant approaches on five datasets
Improves recognition accuracy by explicitly encoding scale information
Combining scale coding with standard deep features yields state-of-the-art results
Abstract
Most approaches to human attribute and action recognition in still images are based on image representations in which multi-scale local features are pooled across scales into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a Bag of Deep Features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow,…
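
The contrast the abstract draws, between pooling local features into one scale-invariant vector and keeping scale information in the final representation, can be illustrated with a small NumPy sketch. This is not the authors' implementation, and it does not reproduce either of the paper's two strategies, which the truncated abstract above does not describe. It assumes, purely for illustration, that each local feature carries a scalar scale value, that features are sum-pooled, and that scale is encoded by pooling separately within fixed scale bins and concatenating the results. The function names, the bin edges, and the toy data are all hypothetical.

```python
import numpy as np


def scale_invariant_pool(features):
    """Sum-pool all local features into one vector, discarding scale."""
    return features.sum(axis=0)


def scale_coded_pool(features, scales, bin_edges):
    """Sum-pool features separately per scale bin, then concatenate.

    features:  (n, d) array of local descriptors (hypothetical input).
    scales:    (n,) array with the scale each descriptor was extracted at.
    bin_edges: increasing thresholds splitting scales into len(bin_edges) + 1 bins
               (an illustrative choice, not taken from the paper).
    """
    n_bins = len(bin_edges) + 1
    # np.digitize maps each scale to a bin index in 0 .. n_bins - 1.
    bin_ids = np.digitize(scales, bin_edges)
    pooled = np.zeros((n_bins, features.shape[1]), dtype=features.dtype)
    # Accumulate each descriptor into the row of its scale bin.
    np.add.at(pooled, bin_ids, features)
    return pooled.reshape(-1)


# Toy data: 6 local features of dimension 4, extracted at 6 different scales.
rng = np.random.default_rng(0)
features = rng.random((6, 4))
scales = np.array([0.5, 0.7, 1.0, 1.4, 2.0, 2.8])
bin_edges = [0.9, 1.8]  # small / medium / large, chosen arbitrarily

invariant = scale_invariant_pool(features)            # shape (4,)
coded = scale_coded_pool(features, scales, bin_edges)  # shape (12,)
```

In this sketch the scale-coded vector is three times longer than the scale-invariant one, and summing its three per-bin blocks recovers the scale-invariant vector, so the coded representation strictly adds information about which scale range each response came from.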
