Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos
Yuan Tian, Guangtao Zhai, Zhiyong Gao

TL;DR
This paper introduces a novel action perceptron synthesizer that learns optimal-scale kernels for video action recognition, enhancing performance across datasets with minimal additional computation.
Contribution
It proposes a data-driven kernel synthesis method with a universal feature fusion layer, improving traditional CNNs for video understanding tasks.
Findings
Achieves state-of-the-art results on challenging datasets.
Outperforms strong baselines with less than 30% of their computation.
Effectively captures scale variances in action primitives.
Abstract
Video action recognition has been partially addressed by CNNs that stack fixed-size 3D kernels. However, these methods may underperform because they capture only rigid spatio-temporal patterns at a single scale and neglect the scale variances across different action primitives. To overcome this limitation, we propose to learn the optimal-scale kernels from the data. More specifically, an action perceptron synthesizer is proposed to generate the kernels from a bag of fixed-size kernels that interact through dense routing paths. To guarantee the interaction richness and the information capacity of the paths, we design a novel optimized feature fusion layer. For the first time, this layer establishes a principled universal paradigm that covers most current feature fusion techniques (e.g., channel shuffling and channel dropout). By inserting…
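The abstract names channel shuffling and channel dropout as techniques the fusion layer is meant to cover. The sketch below is not the authors' layer, whose formulation is not given in this excerpt. It only illustrates, under the assumption that fusion can be viewed as a linear mixing of channels, how both techniques reduce to multiplying the channel axis by a particular matrix. The function names and the (C, T, H, W) feature layout are hypothetical choices made for illustration.

```python
import numpy as np

def channel_shuffle_matrix(channels, groups):
    """Permutation matrix for a ShuffleNet-style channel shuffle (illustrative)."""
    # Reshape channel indices to (groups, channels // groups), transpose, flatten.
    perm = np.arange(channels).reshape(groups, channels // groups).T.reshape(-1)
    # Row i of the result selects input channel perm[i].
    return np.eye(channels)[perm], perm

def channel_dropout_matrix(channels, drop_prob, rng):
    """Diagonal matrix for channel dropout with inverted-dropout scaling (illustrative)."""
    keep = rng.random(channels) >= drop_prob
    scale = keep.astype(float) / (1.0 - drop_prob)
    return np.diag(scale), scale

def fuse(features, mixing):
    """Apply a (C, C) channel-mixing matrix to features shaped (C, T, H, W)."""
    return np.einsum("oc,cthw->othw", mixing, features)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 2, 3, 3))  # 6 channels, 2 frames, 3x3 spatial

shuffle_mat, perm = channel_shuffle_matrix(channels=6, groups=2)
dropout_mat, scale = channel_dropout_matrix(channels=6, drop_prob=0.5, rng=rng)

shuffled = fuse(x, shuffle_mat)  # same as indexing the channels with perm
dropped = fuse(x, dropout_mat)   # same as scaling each channel by 0 or 1 / (1 - p)
```

In this view, a learned dense mixing matrix would include both the permutation and the diagonal mask as special cases. Whether the paper's layer is parameterized this way is an assumption here, not something the abstract states.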
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis
Methods: Convolution · 3D Convolution
