PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu, Haozhi Cao, Jianfei Yang, Kezhi Mao, Jianxiong Yin, and Simon See

TL;DR
This paper introduces the Pyramid Non-Local (PNL) module, enhancing long-range dependency modeling in video action recognition by incorporating multi-scale regional correlations, achieving state-of-the-art results with reduced computational cost.
Contribution
The PNL module extends the non-local block with a pyramid structure to efficiently model regional correlations at multiple scales in videos.
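To make the idea concrete, here is a minimal NumPy sketch of a non-local operation whose keys and values are pooled into regions at several scales. It is an illustration under stated assumptions, not the authors' implementation: the function names, the average pooling over consecutive flattened positions, the averaging of the per-scale responses, and the default scales are all hypothetical choices. The 1x1 convolutions are written as per-position matrix multiplies, and the output is added back to the input through a residual connection.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def avg_pool_positions(x, scale):
    """Average every `scale` consecutive positions: (N, C) -> (N // scale, C).

    Assumption: regions are formed from consecutive flattened positions,
    which is a simplification of pooling over a T x H x W video volume.
    """
    n, c = x.shape
    n_trim = (n // scale) * scale  # drop a ragged tail, if any
    return x[:n_trim].reshape(n // scale, scale, c).mean(axis=1)


def non_local_response(x, regions, w_theta, w_phi, w_g):
    """Embedded-Gaussian non-local response of positions `x` to `regions`."""
    theta = x @ w_theta                 # queries, shape (N, C')
    phi = regions @ w_phi               # keys,    shape (M, C')
    g = regions @ w_g                   # values,  shape (M, C')
    attn = softmax(theta @ phi.T)       # affinities, shape (N, M), rows sum to 1
    return attn @ g                     # shape (N, C')


def pyramid_non_local(x, weights, w_out, scales=(1, 2, 4)):
    """Hypothetical pyramid variant: one non-local response per scale.

    `weights[s]` holds (w_theta, w_phi, w_g) for scale `s`. Averaging the
    per-scale responses is an assumption made for this sketch only.
    With scale 1 the regions are the positions themselves, so that branch
    is a plain non-local operation.
    """
    responses = [
        non_local_response(x, avg_pool_positions(x, s), *weights[s])
        for s in scales
    ]
    y = np.mean(responses, axis=0)      # fuse scales, shape (N, C')
    return x + y @ w_out                # residual connection, shape (N, C)
```

In this sketch each position attends to M = N // scale regions rather than to all N positions, so the affinity matrix for a pooled branch has shape (N, M). Setting `w_out` to zeros makes the block an identity mapping, which is a convenient sanity check.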
Findings
Achieves 83.09% accuracy on the Mini-Kinetics dataset.
Reduces computation cost compared to traditional non-local blocks.
Demonstrates improved effectiveness in capturing long-range dependencies.
Abstract
Capturing long-range spatiotemporal dependencies plays an essential role in improving video features for action recognition. The non-local block, inspired by non-local means, is designed to address this challenge and has shown excellent performance. However, the non-local block brings a significant increase in computation cost to the original network. It also lacks the ability to model regional correlation in videos. To address these limitations, we propose the Pyramid Non-Local (PNL) module, which extends the non-local block by incorporating regional correlation at multiple scales through a pyramid-structured module. This extension improves the effectiveness of the non-local operation by attending to the interaction between different regions. Empirical results demonstrate the effectiveness and efficiency of our PNL module, which achieves state-of-the-art performance of 83.09% on the…
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
Methods: 1x1 Convolution · Residual Connection · Non-Local Operation · Non-Local Block
