Learning to Generalize Provably in Learning to Optimize
Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, Dacheng Tao, Yingbin Liang, Zhangyang Wang

TL;DR
This paper introduces a novel approach to improve the generalization of learned optimizers and optimizees by incorporating flatness-aware regularizers based on local entropy and Hessian metrics, supported by theoretical analysis and extensive experiments.
Contribution
It proposes a new method to enhance generalization in learning to optimize by integrating flatness metrics into the meta-training process, with theoretical justification and empirical validation.
Findings
Significantly improved generalization on multiple models and optimizees.
Theoretical connection between local entropy and Hessian for landscape analysis.
Effective incorporation of flatness-aware regularizers into L2O framework.
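The link between the two flatness metrics named above can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a quadratic optimizee loss, uses the Hessian trace as the Hessian-based summary, and uses an Entropy-SGD-style definition of local entropy, log ∫ exp(-f(θ') - (γ/2)‖θ' - θ‖²) dθ'; the paper's exact definitions may differ. All function names and the coupling strength `gamma` are hypothetical.

```python
import numpy as np


def quadratic_loss(theta, A):
    """Toy optimizee loss f(theta) = 0.5 * theta^T A theta, whose Hessian is A."""
    return 0.5 * theta @ A @ theta


def hessian_fd(f, theta, eps=1e-3):
    """Central finite-difference Hessian of a scalar function f at theta."""
    d = theta.size
    eye = np.eye(d)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            H[i, j] = (
                f(theta + eps * (eye[i] + eye[j]))
                - f(theta + eps * (eye[i] - eye[j]))
                - f(theta - eps * (eye[i] - eye[j]))
                + f(theta - eps * (eye[i] + eye[j]))
            ) / (4 * eps**2)
    return H


def hessian_trace_penalty(f, theta):
    """Hessian-based sharpness summary (assumed here: the trace)."""
    return np.trace(hessian_fd(f, theta))


def local_entropy_mc(f, theta, gamma=1.0, n_samples=50_000, seed=0):
    """Monte Carlo estimate of the assumed local entropy at theta.

    Samples theta' ~ N(theta, I / gamma), so the integral equals
    (2 * pi / gamma)^(d / 2) * E[exp(-f(theta'))].
    """
    rng = np.random.default_rng(seed)
    d = theta.size
    samples = theta + rng.standard_normal((n_samples, d)) / np.sqrt(gamma)
    neg_losses = -np.array([f(s) for s in samples])
    m = neg_losses.max()  # log-mean-exp for numerical stability
    log_mean = m + np.log(np.mean(np.exp(neg_losses - m)))
    return 0.5 * d * np.log(2 * np.pi / gamma) + log_mean


def local_entropy_quadratic(A, gamma=1.0):
    """Closed form of the same quantity at theta = 0 for the quadratic loss."""
    d = A.shape[0]
    _, logdet = np.linalg.slogdet(A + gamma * np.eye(d))
    return 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet


# Two minima at theta = 0: one flat (small curvature), one sharp (large curvature).
flat_A = np.diag([0.5, 0.5])
sharp_A = np.diag([5.0, 5.0])
theta_star = np.zeros(2)


def flat_f(t):
    return quadratic_loss(t, flat_A)


def sharp_f(t):
    return quadratic_loss(t, sharp_A)


flat_trace = hessian_trace_penalty(flat_f, theta_star)
sharp_trace = hessian_trace_penalty(sharp_f, theta_star)
flat_entropy = local_entropy_mc(flat_f, theta_star)
sharp_entropy = local_entropy_mc(sharp_f, theta_star)

print("Hessian trace  flat / sharp:", flat_trace, sharp_trace)
print("Local entropy  flat / sharp:", flat_entropy, sharp_entropy)
```

In this toy setting the sharper minimum has the larger Hessian trace and the smaller local entropy, so a regularizer could either add the Hessian term to the training loss or subtract the local entropy term. How such a term is weighted and inserted into L2O meta-training is not specified in this summary.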
Abstract
Learning to optimize (L2O), which automates the design of optimizers through data-driven approaches, has gained increasing popularity. However, current L2O methods often suffer from poor generalization in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself a machine learning model) trained by the optimizer, in terms of accuracy on unseen data (optimizee generalization, or "learning to generalize"). While optimizer generalization has been studied recently, optimizee generalization has not been rigorously studied in the L2O context, and that is the aim of this paper. We first theoretically establish an implicit connection between the local…
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Human Pose and Action Recognition
Methods: Test
