Scalable Nested Optimization for Deep Learning
Jonathan Lorraine

TL;DR
This thesis introduces scalable methods for nested optimization in deep learning, addressing challenges in hyperparameter tuning and GAN training by extending classical approaches to large-scale problems.
Contribution
It develops new tools for efficient nested optimization tailored for large-scale deep learning applications, overcoming limitations of traditional methods.
Findings
Naively applied classical methods often fail on nested optimization problems at large scale.
Hyperparameter optimization and GAN training serve as the motivating applications.
The proposed tools are built to scale nested optimization to deep learning setups.
Abstract
Gradient-based optimization has been critical to the success of machine learning, updating a single set of parameters to minimize a single loss. A growing number of applications rely on a generalization of this: bilevel or nested optimization, in which subsets of parameters update on different objectives nested inside each other. We focus on motivating examples of hyperparameter optimization and generative adversarial networks. However, naively applying classical methods often fails when we look at solving these nested problems on a large scale. In this thesis, we build tools for nested optimization that scale to deep learning setups.
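To make the nested structure concrete, here is a minimal sketch of bilevel hyperparameter optimization on a toy problem. It is an illustration only, not the thesis's method: the inner problem is ridge regression (solved in closed form), the outer problem tunes the log of the regularization strength on validation data, and the hypergradient comes from the implicit function theorem. All function names (`inner_solve`, `val_loss`, `hypergradient`, `tune`) and the choice of ridge regression are assumptions made for this example.

```python
import numpy as np

def inner_solve(X, y, lam):
    """Inner problem: argmin_w (1/2n)||Xw - y||^2 + (lam/2)||w||^2 (closed form)."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def val_loss(w, Xv, yv):
    """Outer objective: mean squared validation error (with a 1/2 factor)."""
    r = Xv @ w - yv
    return 0.5 * (r @ r) / len(yv)

def hypergradient(X, y, Xv, yv, log_lam):
    """Gradient of the validation loss w.r.t. log_lam via implicit differentiation."""
    n, d = X.shape
    lam = np.exp(log_lam)                  # parameterize lam > 0 through its log
    H = X.T @ X / n + lam * np.eye(d)      # Hessian of the inner (training) loss
    w = np.linalg.solve(H, X.T @ y / n)    # inner optimum w*(lam)
    g_val = Xv.T @ (Xv @ w - yv) / len(yv) # d val_loss / d w at w*
    dw_dlam = -np.linalg.solve(H, w)       # implicit function theorem: -H^{-1} w*
    return lam * (g_val @ dw_dlam), w      # chain rule through lam = exp(log_lam)

def tune(X, y, Xv, yv, log_lam=0.0, lr=0.5, steps=50):
    """Outer loop: plain gradient descent on log_lam (hypothetical settings)."""
    for _ in range(steps):
        g, _ = hypergradient(X, y, Xv, yv, log_lam)
        log_lam -= lr * g
    return log_lam
```

The exact linear solves against the training Hessian are cheap here because the weight vector is tiny. For a deep network that Hessian is far too large to form or invert exactly, which is one way to see why naively applying this classical recipe runs into trouble at scale.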
Taxonomy
Topics: Embedded Systems Design Techniques
Methods: Sparse Evolutionary Training · Focus
