# M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

**Authors:** Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang

arXiv: 2303.00039 · 2023-03-02

## TL;DR

This paper introduces M-L2O, a meta-trained learning-to-optimize (L2O) framework whose optimizer self-adapts to out-of-distribution tasks at test time in only a few steps, converging faster than vanilla L2O.

## Contribution

Proposes M-L2O, a meta-learning approach that enhances L2O's ability to quickly adapt to new, out-of-distribution tasks through test-time self-adaptation.

## Key findings

- With only 5 steps of test-time adaptation, M-L2O converges significantly faster than vanilla L2O.
- Theoretical analysis shows that M-L2O facilitates rapid task adaptation by locating well-adapted initial points for the optimizer weights.
- Empirical results on LASSO and Quadratic tasks confirm the effectiveness of M-L2O.

## Abstract

Learning to Optimize (L2O) has drawn increasing attention as it often remarkably accelerates the optimization procedure of complex tasks by "overfitting" to a specific task type, leading to enhanced performance compared to analytical optimizers. Generally, L2O develops a parameterized optimization method (i.e., an "optimizer") by learning from solving sample problems. This data-driven procedure yields an L2O optimizer that can efficiently solve problems similar to those seen in training, that is, problems drawn from the same "task distribution". However, such learned optimizers often struggle when new test problems deviate substantially from the training task distribution. This paper investigates a potential solution to this open challenge by meta-training an L2O optimizer that can perform fast test-time self-adaptation to an out-of-distribution task in only a few steps. We theoretically characterize the generalization of L2O, and further show that our proposed framework (termed M-L2O) provably facilitates rapid task adaptation by locating well-adapted initial points for the optimizer weights. Empirical observations on several classic tasks such as LASSO and Quadratic demonstrate that M-L2O converges significantly faster than vanilla L2O with only $5$ steps of adaptation, echoing our theoretical results. Code is available at https://github.com/VITA-Group/M-L2O.
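
To make the idea of test-time self-adaptation concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a "learned optimizer" with a single weight `phi` (a log step size) instead of a neural network, a random least-squares task standing in for the Quadratic task, and a finite-difference gradient with backtracking in place of backpropagation through the unrolled optimizer. The names `make_quadratic_task`, `unrolled_loss`, `self_adapt`, and `phi_init` are hypothetical, and `phi_init` is an arbitrary value standing in for a meta-trained initialization, which this sketch does not compute.

```python
import numpy as np


def make_quadratic_task(rng, m=10, n=5):
    """Sample a toy task f(x) = 0.5 * ||A x - b||^2 (assumed task form)."""
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    b = rng.standard_normal(m)
    return A, b


def unrolled_loss(phi, task, num_steps=20):
    """Run the toy optimizer for num_steps and return the final objective.

    The optimizer has one weight, phi, and applies the update
    x <- x - exp(phi) * grad f(x).
    """
    A, b = task
    x = np.zeros(A.shape[1])
    step = np.exp(phi)
    for _ in range(num_steps):
        grad = A.T @ (A @ x - b)
        x = x - step * grad
    return 0.5 * np.sum((A @ x - b) ** 2)


def self_adapt(phi_init, task, adapt_steps=5, lr=1.0, eps=1e-4):
    """Adapt the optimizer weight on a single test task for a few steps.

    Each step estimates d(loss)/d(phi) by central finite differences and
    halves the step length until the unrolled loss decreases.
    """
    phi = phi_init
    loss = unrolled_loss(phi, task)
    for _ in range(adapt_steps):
        g = (unrolled_loss(phi + eps, task)
             - unrolled_loss(phi - eps, task)) / (2 * eps)
        step_lr = lr
        for _ in range(20):  # backtracking: accept only improving steps
            cand = phi - step_lr * g
            cand_loss = unrolled_loss(cand, task)
            if cand_loss < loss:
                phi, loss = cand, cand_loss
                break
            step_lr *= 0.5
    return phi, loss


rng = np.random.default_rng(0)
test_task = make_quadratic_task(rng)

# Stand-in for a meta-trained initialization (assumption, not learned here).
phi_init = np.log(1e-2)

loss_before = unrolled_loss(phi_init, test_task)
phi_adapted, loss_after = self_adapt(phi_init, test_task, adapt_steps=5)
print(f"unrolled loss before adaptation: {loss_before:.4f}")
print(f"unrolled loss after 5 adaptation steps: {loss_after:.4f}")
```

In this sketch the quality of `phi_init` determines how much 5 adaptation steps can achieve, which mirrors the role the abstract assigns to meta-training: finding an initial optimizer weight from which a few steps suffice.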

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2303.00039/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2303.00039/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/2303.00039/full.md

---
Source: https://tomesphere.com/paper/2303.00039