PIC: Permutation Invariant Convolution for Recognizing Long-range Activities
Noureldien Hussein, Efstratios Gavves, Arnold W.M. Smeulders

TL;DR
This paper introduces PIC, a permutation invariant convolutional layer designed to better recognize long-range activities by overcoming limitations of traditional neural operations like convolution, self-attention, and vector aggregation.
Contribution
The paper proposes a novel neural layer, PIC, that is permutation invariant, respects local connectivity, and uses shared weights, improving long-range activity recognition.
Findings
PIC outperforms existing methods on the Charades, Breakfast, and MultiTHUMOS datasets.
PIC effectively models weak temporal structures in long-range activities.
The approach demonstrates robustness to noisy video data.
Abstract
Neural operations such as convolutions, self-attention, and vector aggregation are the go-to choices for recognizing short-range actions. However, they have three limitations in modeling long-range activities. This paper presents PIC, Permutation Invariant Convolution, a novel neural layer to model the temporal structure of long-range activities. It has three desirable properties. i. Unlike standard convolution, PIC is invariant to the temporal permutations of features within its receptive field, qualifying it to model weak temporal structures. ii. Different from vector aggregation, PIC respects local connectivity, enabling it to learn long-range temporal abstractions using cascaded layers. iii. In contrast to self-attention, PIC uses shared weights, making it more capable of detecting the most discriminant visual evidence across long and noisy videos. We study the three properties of…
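The three properties listed in the abstract can be illustrated with a toy operation. The following is a minimal sketch, not the authors' PIC layer: it assumes a hypothetical function `permutation_invariant_window_op` that applies one shared weight matrix to every timestep and then max-pools the responses inside each local temporal window. The shapes, window size, and stride are illustrative assumptions only.

```python
import numpy as np


def permutation_invariant_window_op(x, w, window, stride):
    """Toy local, weight-sharing, order-insensitive temporal operation.

    Hypothetical illustration only, not the PIC layer from the paper.
    x: (T, C_in) per-timestep features.
    w: (C_in, C_out) weights shared by every timestep and every window.
    Returns an array of shape (num_windows, C_out).
    """
    # Shared weights: the same projection is applied to every timestep.
    responses = x @ w  # (T, C_out)
    outputs = []
    # Local connectivity: each output only sees `window` consecutive timesteps.
    for start in range(0, x.shape[0] - window + 1, stride):
        # Max over the window ignores the order of timesteps inside it.
        outputs.append(responses[start:start + window].max(axis=0))
    return np.stack(outputs)


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))   # 8 timesteps, 4 input channels (assumed sizes)
w = rng.normal(size=(4, 3))   # 3 output channels (assumed size)

y = permutation_invariant_window_op(x, w, window=4, stride=4)

# Shuffle the timesteps inside each of the two windows separately.
x_shuffled = x.copy()
x_shuffled[0:4] = x[[2, 0, 3, 1]]
x_shuffled[4:8] = x[[7, 5, 4, 6]]
y_shuffled = permutation_invariant_window_op(x_shuffled, w, window=4, stride=4)

# The outputs agree even though the order inside each window changed.
same_output = np.allclose(y, y_shuffled)
```

In this sketch, shuffling timesteps within a window leaves the output unchanged, whereas moving a timestep from one window to another generally would not. That is the sense in which the operation is both permutation invariant and local. Stacking such operations would let later layers cover longer temporal extents, in the spirit of the cascaded layers the abstract mentions.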
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
Methods: Convolution
