Meta-Learning by the Baldwin Effect

Chrisantha Thomas Fernando; Jakub Sygnowski; Simon Osindero; Jane Wang; Tom Schaul; Denis Teplyashin; Pablo Sprechmann; Alexander Pritzel; Andrei A. Rusu

arXiv:1806.07917 · cs.NE · June 25, 2018

TL;DR

This paper demonstrates that the Baldwin effect can evolve few-shot learning mechanisms in deep neural networks by shaping their hyperparameters and initial weights. It compares the approach with MAML and argues that the Baldwin effect is the more general and flexible of the two.

Contribution

It shows that the Baldwin effect can be used to evolve few-shot learning strategies, offering a gradient-free alternative to MAML with broader applicability.

Findings

01

The Baldwin effect can evolve the hyperparameters and initial weights of deep learning algorithms.

02

It can genetically accommodate strong learning biases on the same set of problems that MAML addresses.

03

The Baldwin effect is more general and flexible than gradient-based meta-learning methods such as MAML.

Abstract

The scope of the Baldwin effect was recently called into question by two papers that closely examined the seminal work of Hinton and Nowlan. To this date there has been no demonstration of its necessity in empirically challenging tasks. Here we show that the Baldwin effect is capable of evolving few-shot supervised and reinforcement learning mechanisms, by shaping the hyperparameters and the initial parameters of deep learning algorithms. Furthermore, it can genetically accommodate strong learning biases on the same set of problems as a recent machine learning algorithm called MAML ("Model-Agnostic Meta-Learning"), which uses second-order gradients instead of evolution to learn a set of reference parameters (initial weights) that allow rapid adaptation to tasks sampled from a distribution. Whilst in simple cases MAML is more data efficient than the Baldwin effect, the Baldwin effect is…
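The mechanism described in the abstract can be illustrated with a toy sketch. The code below is not the paper's implementation: the task family (one-dimensional linear regression with a random slope), the truncation-selection scheme, and all names and constants (`sample_task`, `adapt`, `evolve`, the mutation scale, the population size) are assumptions chosen for brevity. It shows only the Baldwinian structure: a genome encodes an initial weight and a learning rate, fitness is measured after a few gradient steps on a freshly sampled task, and offspring inherit the genome rather than the adapted weight.

```python
import random


def sample_task(rng):
    """Hypothetical task distribution: y = slope * x, slope drawn from [2, 4]."""
    return rng.uniform(2.0, 4.0)


def adapt(w0, lr, slope, xs, steps):
    """Inner loop (lifetime learning): a few gradient steps on mean squared error."""
    w = w0
    for _ in range(steps):
        grad = sum(2.0 * (w * x - slope * x) * x for x in xs) / len(xs)
        w -= lr * grad
    return w


def fitness(genome, rng, n_tasks=8, shots=5, steps=3):
    """Negative post-adaptation error, averaged over sampled tasks."""
    w0, lr = genome
    total = 0.0
    for _ in range(n_tasks):
        slope = sample_task(rng)
        xs = [rng.uniform(-1.0, 1.0) for _ in range(shots)]
        w = adapt(w0, lr, slope, xs, steps)
        # For the model y = w * x, held-out error is proportional to this term.
        total += (w - slope) ** 2
    return -total / n_tasks


def mutate(genome, rng, sigma=0.1):
    """Gaussian mutation; the learning rate is clipped to [0, 1] for stability."""
    w0, lr = genome
    new_lr = min(max(lr + rng.gauss(0.0, sigma), 0.0), 1.0)
    return (w0 + rng.gauss(0.0, sigma), new_lr)


def evolve(generations=60, pop_size=20, seed=0):
    """Outer loop: truncation selection over (initial weight, learning rate)."""
    rng = random.Random(seed)
    pop = [(rng.uniform(-1.0, 1.0), rng.uniform(0.0, 0.1)) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda g: fitness(g, rng), reverse=True)
        parents = scored[: pop_size // 4]
        # Baldwinian inheritance: children copy the genome (w0, lr),
        # never the weight reached after lifetime learning.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=lambda g: fitness(g, rng))


if __name__ == "__main__":
    w0, lr = evolve()
    print(f"evolved initial weight={w0:.3f}, learning rate={lr:.3f}")
```

In this sketch, selection acts only on post-learning performance, so any pressure on the initial weight and learning rate comes from how well they support rapid adaptation. A gradient-based method such as MAML would instead differentiate through the `adapt` steps to update the initial weight.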

Taxonomy

Methods: Model-Agnostic Meta-Learning