Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou, Jinwoo Choi, Qitong Wang, Jia-Bin Huang

TL;DR
This paper explores diverse data augmentation strategies for videos to improve action recognition, demonstrating their effectiveness in low-label and fully supervised settings across multiple datasets.
Contribution
The paper investigates video data augmentation strategies that capture photometric, geometric, temporal, and actor/scene invariances, and integrates them with existing semi-supervised learning frameworks.
Findings
Improved accuracy on the Kinetics-100/400, Mini-Something-v2, UCF-101, and HMDB-51 datasets.
Effective augmentation strategies for photometric, geometric, temporal, and actor/scene invariances.
Enhanced performance in both low-label and fully supervised regimes.
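To make the augmentation families listed above concrete, here is a minimal NumPy sketch of one photometric, one geometric, and one temporal operation applied to a clip. It is an illustration only, not the authors' implementation: the function names, the (T, H, W, C) layout, and the parameter values are all assumptions, and the actor/scene augmentation is left out because this page does not describe how it is constructed.

```python
import numpy as np

# Hypothetical helpers for illustration; names and defaults are assumptions.

def photometric_jitter(clip, rng, max_delta=0.2):
    """Photometric: add one random brightness offset shared by all frames."""
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(clip + delta, 0.0, 1.0)

def horizontal_flip(clip):
    """Geometric: mirror every frame along the width axis."""
    return clip[:, :, ::-1, :]

def temporal_subsample(clip, rng, num_frames):
    """Temporal: keep a random subset of frames, preserving their order."""
    idx = np.sort(rng.choice(clip.shape[0], size=num_frames, replace=False))
    return clip[idx]

rng = np.random.default_rng(0)
clip = rng.random((16, 32, 32, 3))  # synthetic clip: (T, H, W, C), values in [0, 1)

out = photometric_jitter(clip, rng)
out = horizontal_flip(out)
out = temporal_subsample(out, rng, num_frames=8)
print(out.shape)  # (8, 32, 32, 3)
```

Using a single brightness offset per clip, rather than one per frame, keeps the perturbation temporally consistent; whether that matches the paper's choice is not stated here.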
Abstract
Data augmentation is a ubiquitous technique for improving image classification when labeled data is scarce. Constraining the model predictions to be invariant to diverse data augmentations effectively injects the desired representational invariances to the model (e.g., invariance to photometric variations) and helps improve accuracy. Compared to image data, the appearance variations in videos are far more complex due to the additional temporal dimension. Yet, data augmentation methods for videos remain under-explored. This paper investigates various data augmentation strategies that capture different video invariances, including photometric, geometric, temporal, and actor/scene augmentations. When integrated with existing semi-supervised learning frameworks, we show that our data augmentation strategy leads to promising performance on the Kinetics-100/400, Mini-Something-v2, UCF-101,…
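The idea of constraining predictions to be invariant to augmentation can be sketched as a consistency loss on unlabeled clips. The example below assumes a FixMatch-style formulation (confident predictions on a weakly augmented view become pseudo-labels for a strongly augmented view); the abstract only says "existing semi-supervised learning frameworks", so the specific framework, the function names, and the 0.95 threshold are assumptions rather than details taken from the paper.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def consistency_loss(weak_logits, strong_logits, threshold=0.95):
    """FixMatch-style loss (assumed): pseudo-label from the weak view,
    cross-entropy on the strong view, masked by prediction confidence."""
    weak_probs = np.exp(log_softmax(weak_logits))
    pseudo_labels = weak_probs.argmax(axis=1)
    confident = weak_probs.max(axis=1) >= threshold
    strong_log_probs = log_softmax(strong_logits)
    ce = -strong_log_probs[np.arange(len(pseudo_labels)), pseudo_labels]
    # Average over the whole batch; unconfident samples contribute zero.
    return float((ce * confident).mean())

# Model outputs for two unlabeled clips under weak and strong augmentation.
weak_logits = np.array([[10.0, 0.0, 0.0], [1.0, 0.9, 0.8]])
strong_logits = np.array([[5.0, 0.0, 0.0], [0.0, 0.0, 5.0]])
loss = consistency_loss(weak_logits, strong_logits)
print(loss)
```

Only the first clip passes the confidence threshold here, so the second clip adds nothing to the loss even though its strong-view prediction disagrees with its weak-view prediction.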
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
