Learning to Generalize Provably in Learning to Optimize

Junjie Yang; Tianlong Chen; Mingkang Zhu; Fengxiang He; Dacheng Tao; Yingbin Liang; Zhangyang Wang

arXiv:2302.11085 · cs.LG · March 29, 2023 · 1 citation

TL;DR

This paper targets optimizee generalization in learning to optimize (L2O): it adds flatness-aware regularizers, based on local entropy and on the Hessian, to meta-training so that models trained by the learned optimizer generalize better to unseen data. The approach is supported by theoretical analysis and experiments.

Contribution

It proposes a new method to enhance generalization in learning to optimize by integrating flatness metrics into the meta-training process, with theoretical justification and empirical validation.
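
As a rough illustration of what a Hessian-based flatness penalty can look like, the sketch below adds an estimate of the Hessian trace to a toy loss. This is not the paper's implementation: the summary does not say which Hessian metric is used, so the trace, the Hutchinson estimator, the finite-difference Hessian-vector product, and every name here (`hvp_fd`, `hutchinson_trace`, `flatness_regularized_loss`, `lam`) are assumptions chosen for illustration.

```python
import numpy as np


def hvp_fd(grad_fn, w, v, eps=1e-3):
    """Hessian-vector product via central differences of the gradient."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)


def hutchinson_trace(grad_fn, w, num_probes=8, seed=0):
    """Estimate trace(Hessian) at w using random +/-1 (Rademacher) probes."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=w.shape)
        total += v @ hvp_fd(grad_fn, w, v)
    return total / num_probes


def flatness_regularized_loss(loss_fn, grad_fn, w, lam=0.1):
    """Loss plus lam * (estimated Hessian trace); lam is a hypothetical weight."""
    return loss_fn(w) + lam * hutchinson_trace(grad_fn, w)


def make_quadratic(A):
    """Toy loss 0.5 * w^T A w with its analytic gradient A w."""
    return (lambda w: 0.5 * w @ A @ w), (lambda w: A @ w)


# Two toy minima at w = 0 with equal loss but different curvature.
w_star = np.zeros(2)
sharp_loss, sharp_grad = make_quadratic(np.diag([10.0, 10.0]))
flat_loss, flat_grad = make_quadratic(np.diag([1.0, 1.0]))

sharp_value = flatness_regularized_loss(sharp_loss, sharp_grad, w_star)
flat_value = flatness_regularized_loss(flat_loss, flat_grad, w_star)
```

Both toy minima have zero loss, so the plain objective cannot tell them apart; the penalized objective scores the low-curvature minimum lower. In an L2O setting a term like this would be added to the meta-training objective, which this toy example does not attempt to reproduce.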

Findings

01. Significantly improved generalization on multiple models and optimizees.

02. Theoretical connection between local entropy and Hessian for landscape analysis.

03. Effective incorporation of flatness-aware regularizers into the L2O framework.

Abstract

Learning to optimize (L2O), which automates the design of optimizers through data-driven approaches, has gained increasing popularity. However, current L2O methods often suffer from poor generalization in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself a machine learning model) trained by the optimizer, in terms of accuracy on unseen data (optimizee generalization, or "learning to generalize"). While optimizer generalization has been studied recently, optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper. We first theoretically establish an implicit connection between the local…

Code & Models

Repositories

VITA-Group/Open-L2O · tf · Official

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Human Pose and Action Recognition

Methods: Test