Meta-SGD: Learning to Learn Quickly for Few-Shot Learning
Zhenguo Li, Fengwei Zhou, Fei Chen, Hang Li

TL;DR
Meta-SGD is a meta-learning algorithm that initializes and adapts any differentiable learner in a single step, and it is reported to outperform earlier meta-learners on few-shot regression, classification, and reinforcement learning.
Contribution
The paper introduces Meta-SGD, a simple, SGD-like meta-learner that learns the learner's initialization, update direction, and learning rate in a single meta-learning process, improving few-shot learning performance.
Findings
Meta-SGD is conceptually simpler, easier to implement, and more efficient to train than LSTM-based meta-learners.
Meta-SGD achieves higher accuracy in few-shot regression, classification, and reinforcement learning.
Meta-SGD learns the initialization, the update direction, and the learning rate together, giving it higher capacity than MAML, which learns only the initialization.
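The last finding can be written compactly. The notation below is an assumption chosen for illustration: θ is the learner's initialization, α is a vector of the same shape as θ, ∘ is the elementwise product, and the loss of task T is evaluated on its training and test splits. Both θ and α are meta-learned, so α ∘ ∇L sets the direction and the length of the step at once.

```latex
% One-step adaptation on task T (notation assumed for illustration)
\theta' = \theta - \alpha \circ \nabla_{\theta}\, \mathcal{L}_{\mathrm{train}(\mathcal{T})}(\theta)

% Meta-objective: choose \theta and \alpha so the adapted learner does well on held-out data
\min_{\theta,\,\alpha} \; \mathbb{E}_{\mathcal{T} \sim p(\mathcal{T})}
  \left[ \mathcal{L}_{\mathrm{test}(\mathcal{T})}(\theta') \right]
```

With α fixed to a single scalar step size rather than learned, the adaptation step reduces to a plain gradient step from a learned initialization, which is the MAML-style setting the abstract compares against.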
Abstract
Few-shot learning is challenging for learning algorithms that learn each task in isolation and from scratch. In contrast, meta-learning learns from many related tasks a meta-learner that can learn a new task more accurately and faster with fewer examples, where the choice of meta-learners is crucial. In this paper, we develop Meta-SGD, an SGD-like, easily trainable meta-learner that can initialize and adapt any differentiable learner in just one step, on both supervised learning and reinforcement learning. Compared to the popular meta-learner LSTM, Meta-SGD is conceptually simpler, easier to implement, and can be learned more efficiently. Compared to the latest meta-learner MAML, Meta-SGD has a much higher capacity by learning to learn not just the learner initialization, but also the learner update direction and learning rate, all in a single meta-learning process. Meta-SGD shows…
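As a minimal sketch of the one-step adaptation the abstract describes, the snippet below applies an elementwise step `theta - alpha * grad` to a toy linear regression learner. The task, the function names `mse_grad` and `adapt`, and the values of `theta` and `alpha` are all hypothetical placeholders; in Meta-SGD both would be produced by meta-training over many tasks, which is not shown here.

```python
# Toy illustration of a Meta-SGD-style one-step adaptation.
# All names and numbers are hypothetical; meta-training is not shown.


def mse_grad(theta, xs, ys):
    """Gradient of the mean squared error for the linear model y = w * x + b."""
    w, b = theta
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return [grad_w, grad_b]


def adapt(theta, alpha, xs, ys):
    """Single adaptation step: theta' = theta - alpha * grad, elementwise."""
    grad = mse_grad(theta, xs, ys)
    return [t - a * g for t, a, g in zip(theta, alpha, grad)]


# Placeholder "meta-learned" quantities (assumed values, not from the paper).
theta = [0.0, 0.0]   # initialization for (w, b)
alpha = [0.1, 0.5]   # one step size per parameter

# Tiny support set drawn from y = 2x + 1.
xs, ys = [1.0, 2.0], [3.0, 5.0]

adapted = adapt(theta, alpha, xs, ys)
print(adapted)
```

Because `alpha` has one entry per parameter, each coordinate moves by its own amount, so the step generally does not point along the raw gradient. An entry of `alpha` may also be negative, which moves that coordinate in the direction of the gradient rather than against it.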
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification
Methods: Sigmoid Activation · Tanh Activation · Model-Agnostic Meta-Learning · Long Short-Term Memory
