Stochastic Hyperparameter Optimization through Hypernetworks

Jonathan Lorraine; David Duvenaud

arXiv:1802.09419 · cs.LG · March 9, 2018 · 87 cites

TL;DR

This paper introduces a hypernetwork-based method that jointly optimizes model weights and hyperparameters, replacing the usual nested tuning loop and scaling to thousands of hyperparameters.

Contribution

The paper collapses the nested optimization of hyperparameters and weights into a single joint stochastic optimization, using a hypernetwork that outputs approximately optimal weights as a function of the hyperparameters.
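
In symbols, the collapse can be sketched as follows. The notation is an assumption, since none is given on this page: w denotes model weights, λ hyperparameters, w_φ a hypernetwork with parameters φ, and p(λ) a distribution from which hyperparameters are sampled during training.

```latex
% Nested formulation: every outer step on \lambda needs an inner solve for w.
\lambda^{*} = \arg\min_{\lambda} \mathcal{L}_{\text{valid}}\bigl(w^{*}(\lambda)\bigr),
\qquad
w^{*}(\lambda) = \arg\min_{w} \mathcal{L}_{\text{train}}(w, \lambda)

% Hypernetwork surrogate: learn w_{\phi}(\lambda) \approx w^{*}(\lambda) once,
% then optimize \lambda through it.
\phi^{*} = \arg\min_{\phi} \mathbb{E}_{\lambda \sim p(\lambda)}
    \mathcal{L}_{\text{train}}\bigl(w_{\phi}(\lambda), \lambda\bigr),
\qquad
\hat{\lambda} = \arg\min_{\lambda} \mathcal{L}_{\text{valid}}\bigl(w_{\phi^{*}}(\lambda)\bigr)
```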

Findings

01

Converges to locally optimal weights and hyperparameters when the hypernetwork is sufficiently large.

02

Effective in tuning thousands of hyperparameters.

03

Compares favorably with standard hyperparameter optimization strategies in the paper's experiments.

Abstract

Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparameters. Our process trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our technique converges to locally optimal weights and hyperparameters for sufficiently large hypernetworks. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
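
The idea of training a network to output weights as a function of hyperparameters can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the model has a single scalar weight, the losses are simple quadratics, the hypernetwork is linear, and every name and constant (`A`, `B`, `hypernet`, `joint_optimize`, the learning rates, `sigma`) is hypothetical. Each iteration samples a hyperparameter near the current one, takes a gradient step on the hypernetwork using the training loss, then takes a gradient step on the hyperparameter using the validation loss evaluated through the hypernetwork.

```python
import math
import random

A = 2.0  # training target (hypothetical toy constant)
B = 1.0  # validation target (hypothetical toy constant)


def hypernet(phi, lam):
    """Linear hypernetwork: maps the hyperparameter lam to a scalar weight."""
    return phi[0] + phi[1] * lam


def train_grad_w(w, lam):
    """d/dw of the toy training loss (w - A)^2 + exp(lam) * w^2."""
    return 2.0 * (w - A) + 2.0 * math.exp(lam) * w


def val_grad_w(w):
    """d/dw of the toy validation loss (w - B)^2."""
    return 2.0 * (w - B)


def joint_optimize(steps=20000, lr_phi=0.03, lr_lam=0.003, sigma=0.3, seed=0):
    rng = random.Random(seed)
    phi = [0.0, 0.0]  # hypernetwork parameters
    lam = -1.0        # log regularization strength
    for _ in range(steps):
        # Hypernetwork step: sample a hyperparameter near the current one
        # and descend the training loss with respect to phi (chain rule).
        lam_s = lam + sigma * rng.gauss(0.0, 1.0)
        g = train_grad_w(hypernet(phi, lam_s), lam_s)
        phi[0] -= lr_phi * g
        phi[1] -= lr_phi * g * lam_s
        # Hyperparameter step: descend the validation loss with respect to
        # lam, differentiating through the hypernetwork (d w / d lam = phi[1]).
        lam -= lr_lam * val_grad_w(hypernet(phi, lam)) * phi[1]
    return phi, lam


if __name__ == "__main__":
    phi, lam = joint_optimize()
    print("lam:", lam, "weight:", hypernet(phi, lam))
```

In this toy problem the exact best response is w*(λ) = A / (1 + exp(λ)), so with A = 2 and B = 1 the validation loss is minimized at λ = 0. The sketch should end close to that value, which gives a simple check that the hypernetwork is tracking the best response locally.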

Code & Models

Repositories

lorraine2/hypernet-hypertraining

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Neural Networks and Applications

Methods: HyperNetwork