CLTA: Contents and Length-based Temporal Attention for Few-shot Action Recognition
Yang Bo, Yangdi Lu, Wenbo He

TL;DR
This paper introduces CLTA, a temporal attention model for few-shot action recognition that learns attention customized to each video's content and length, matching or outperforming existing methods.
Contribution
The paper proposes a new temporal attention mechanism that considers both frame content and video length, improving few-shot action recognition without requiring the backbone to be fine-tuned.
Findings
CLTA achieves results comparable to or better than state-of-the-art methods.
The model effectively captures temporal information using Gaussian likelihood-based attention.
Performance is maintained even with a non-fine-tuned backbone.
Abstract
Few-shot action recognition has attracted increasing attention due to the difficulty of acquiring properly labelled training samples. Current works have shown that preserving spatial information and comparing video descriptors are crucial for few-shot action recognition. However, the importance of preserving temporal information has not been well discussed. In this paper, we propose a Contents and Length-based Temporal Attention (CLTA) model, which learns customized temporal attention for each individual video to tackle the few-shot action recognition problem. CLTA uses the Gaussian likelihood function as the template to generate temporal attention and trains learning matrices to estimate the mean and standard deviation based on both frame contents and video length. We show that even a backbone that is not fine-tuned, paired with an ordinary softmax classifier, can still achieve similar or better results…
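The Gaussian-template idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `gaussian_temporal_attention`, the weight vectors `w_mu` and `w_sigma`, the use of the mean frame feature plus log-length as the "content and length" summary, and the sigmoid/softplus squashing are all assumptions made for illustration. It only shows how a predicted mean and standard deviation could be turned into per-frame attention weights.

```python
import numpy as np

def gaussian_temporal_attention(frame_feats, w_mu, w_sigma, eps=1e-6):
    """Hypothetical sketch: Gaussian-shaped attention over the frames of one video.

    frame_feats: (T, D) array of per-frame features from some backbone.
    w_mu, w_sigma: (D + 1,) weight vectors (assumed stand-ins for the
                   paper's learning matrices; their real form is not given here).
    Returns (weights, video_descriptor).
    """
    T = frame_feats.shape[0]
    # Frame positions normalized to [0, 1] so videos of any length share a scale.
    positions = np.linspace(0.0, 1.0, T)

    # Assumed summary of "contents and length": mean frame feature plus log(T).
    summary = np.concatenate([frame_feats.mean(axis=0), [np.log(T)]])

    # Sigmoid keeps the mean inside the clip; softplus keeps the std positive.
    mu = 1.0 / (1.0 + np.exp(-(summary @ w_mu)))
    sigma = np.logaddexp(0.0, summary @ w_sigma) + eps

    # Gaussian likelihood (up to a constant) evaluated at each frame position.
    weights = np.exp(-0.5 * ((positions - mu) / sigma) ** 2)
    weights /= weights.sum()  # normalize so the weights sum to 1

    # Attention-weighted average of frame features as the video descriptor.
    return weights, weights @ frame_feats

# Usage with random features for a 9-frame clip and untrained (zero) weights.
rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 4))
weights, descriptor = gaussian_temporal_attention(feats, np.zeros(5), np.zeros(5))
```

With zero weights the predicted mean sits at the middle of the clip, so the attention peaks on the central frame. In a trained model the mean and standard deviation would vary per video, which is what makes the attention "customized" in the sense the abstract describes.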
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications
Methods: Softmax
