Semi Supervised Meta Learning for Spatiotemporal Learning
Faraz Waseem, Pratyush Muthukumar

TL;DR
This paper explores combining meta-learning with self-supervised masked autoencoders for improved spatiotemporal learning, testing various architectures including pre-trained models and memory-augmented networks.
Contribution
It introduces a novel approach that integrates meta-learning with self-supervised masked autoencoders, using the Memory Augmented Neural Network (MANN) architecture, for spatiotemporal tasks.
Findings
Meta-learning enhances spatiotemporal representation learning.
Pre-trained MAE combined with MANN improves action classification.
Different architecture combinations impact learning effectiveness.
Abstract
We approached the goal of applying meta-learning to self-supervised masked autoencoders (MAEs) for spatiotemporal learning in three steps. Broadly, we seek to understand the impact of applying meta-learning to existing state-of-the-art representation learning architectures. Thus, we test spatiotemporal learning through: a meta-learning architecture only, a representation learning architecture only, and an architecture applying representation learning alongside a meta-learning architecture. We utilize the Memory Augmented Neural Network (MANN) architecture to apply meta-learning to our framework. Specifically, we first experiment with applying a pre-trained MAE and fine-tuning on our small-scale spatiotemporal dataset for video reconstruction tasks. Next, we experiment with training an MAE encoder and applying a classification head for action classification tasks. Finally, we experiment with…
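The masked-autoencoder step mentioned in the abstract rests on hiding most patches of a clip and reconstructing them from the few that stay visible. The snippet below is a minimal sketch of that random masking only, not the authors' implementation: the function name `random_patch_mask`, the 90% mask ratio, the clip size (8 frames of 196 patches), and the seed are all assumptions chosen for illustration.

```python
import random


def random_patch_mask(num_frames, patches_per_frame, mask_ratio=0.9, seed=0):
    """Split a clip's flat patch indices into visible and masked sets.

    Hypothetical helper: mask_ratio and seed are illustrative defaults,
    not values taken from the paper.
    """
    total = num_frames * patches_per_frame
    num_masked = int(total * mask_ratio)

    # Shuffle all patch indices, then hide the first num_masked of them.
    rng = random.Random(seed)
    order = list(range(total))
    rng.shuffle(order)

    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed clip layout: 8 frames, each split into a 14 x 14 grid of patches.
visible, masked = random_patch_mask(num_frames=8, patches_per_frame=196)
print(len(visible), len(masked))
```

In a setup like this, an encoder would see only the `visible` patches and a decoder would be trained to reconstruct the `masked` ones; a classification head or memory module would then consume the encoder's output.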
Taxonomy
Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Masked autoencoder
