Stochastic Hyperparameter Optimization through Hypernetworks
Jonathan Lorraine, David Duvenaud

TL;DR
This paper introduces a neural network-based method to jointly optimize model weights and hyperparameters, simplifying the tuning process and effectively handling thousands of hyperparameters.
Contribution
The paper collapses the nested optimization of hyperparameters and model weights into a single joint stochastic optimization, using a hypernetwork that maps hyperparameters to weights.
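As a sketch of what "collapsing" means here, the nested problem and its hypernetwork replacement can be written as below. The notation is assumed for illustration and may differ from the paper's: w for weights, λ for hyperparameters, φ for hypernetwork parameters, and p(λ) for a sampling distribution over hyperparameters.

```latex
% Nested (bilevel) tuning: the inner problem is re-solved for every candidate \lambda
\lambda^{*} = \arg\min_{\lambda} \; \mathcal{L}_{\text{valid}}\bigl(w^{*}(\lambda)\bigr),
\qquad
w^{*}(\lambda) = \arg\min_{w} \; \mathcal{L}_{\text{train}}(w, \lambda)

% Hypernetwork version: w_{\phi}(\lambda) is trained to approximate w^{*}(\lambda)
\phi^{*} = \arg\min_{\phi} \; \mathbb{E}_{\lambda \sim p(\lambda)}
  \bigl[ \mathcal{L}_{\text{train}}(w_{\phi}(\lambda), \lambda) \bigr],
\qquad
\hat{\lambda} = \arg\min_{\lambda} \; \mathcal{L}_{\text{valid}}\bigl(w_{\phi^{*}}(\lambda)\bigr)
```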
Findings
Converges to locally optimal weights and hyperparameters when the hypernetwork is sufficiently large.
Effective in tuning thousands of hyperparameters.
Compares the method against standard hyperparameter optimization strategies.
Abstract
Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparameters. Our process trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our technique converges to locally optimal weights and hyperparameters for sufficiently large hypernetworks. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
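The training scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy problem with one scalar weight and one hyperparameter (a log L2 penalty strength), a linear hypernetwork w_phi(lam) = a + b * lam, and hand-written gradients. The function names, the losses, the warm-up phase, and all step sizes are hypothetical choices made for the example.

```python
import math
import random


def train_grad_w(w, lam):
    """d/dw of the toy training loss (w - 2)^2 + exp(lam) * w^2."""
    return 2.0 * (w - 2.0) + 2.0 * math.exp(lam) * w


def valid_grad_w(w):
    """d/dw of the toy validation loss (w - 1)^2."""
    return 2.0 * (w - 1.0)


def joint_optimize(steps=12000, warmup=2000, lr=0.01, width=0.5, seed=0):
    rng = random.Random(seed)
    a, b = 0.0, 0.0      # hypernetwork parameters: w_phi(lam) = a + b * lam
    lam_hat = 1.0        # current hyperparameter estimate (log L2 strength)
    for step in range(steps):
        # Hypernetwork step: sample a hyperparameter near lam_hat and descend
        # the training loss with respect to (a, b) via the chain rule.
        lam = lam_hat + rng.uniform(-width, width)
        g = train_grad_w(a + b * lam, lam)
        a -= lr * g
        b -= lr * g * lam
        # Hyperparameter step: descend the validation loss through the
        # hypernetwork, using d w_phi / d lam = b. Skipped during warm-up
        # (an assumption of this sketch) so the hypernetwork is fitted first.
        if step >= warmup:
            lam_hat -= lr * valid_grad_w(a + b * lam_hat) * b
    return lam_hat, a + b * lam_hat


lam_hat, w = joint_optimize()
print(f"lam_hat = {lam_hat:.3f}, w = {w:.3f}")
```

For this toy problem the best response is w*(lam) = 2 / (1 + exp(lam)), so the validation loss is minimized at lam = 0 with w = 1, and the run should end close to those values. A linear hypernetwork is only a local approximation of the best response, which is why hyperparameters are sampled in a small window around the current estimate.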
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Neural Networks and Applications
Methods: HyperNetwork
