Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes

Konstantin Mishchenko; Slavomír Hanzely; Peter Richtárik

arXiv:2301.06806·math.OC·January 18, 2023

TL;DR

This paper develops a theoretical framework for the convergence of first-order algorithms in meta-learning, specifically focusing on Moreau envelope minimization, and improves upon existing inexact SGD guarantees without requiring Hessian smoothness.

Contribution

It introduces a convergence theory for first-order meta-learning algorithms using Moreau envelopes, with tighter guarantees and no Hessian smoothness assumptions, extending the understanding of FO-MAML and related methods.
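For orientation, the objects named above can be written out as follows. This is a sketch in assumed notation: the symbols $f_i$ (task losses), $n$ (number of tasks), $\alpha$ (inner step size) and $\beta$ (outer step size) are chosen here for illustration and may differ from the paper's own, and the gradient identity assumes each $f_i$ is differentiable with $\alpha$ small enough for the inner minimizer to be unique.

```latex
% Moreau envelope of task loss f_i with parameter alpha > 0
F_i(x) = \min_{z} \Big\{ f_i(z) + \frac{1}{2\alpha}\,\|z - x\|^2 \Big\},
\qquad
\min_{x}\; F(x) = \frac{1}{n}\sum_{i=1}^{n} F_i(x).

% Gradient of the envelope, with z_i(x) the minimizer of the inner problem
\nabla F_i(x) = \frac{x - z_i(x)}{\alpha} = \nabla f_i\big(z_i(x)\big),
\qquad
z_i(x) = x - \alpha\,\nabla f_i\big(z_i(x)\big).

% FO-MAML replaces z_i(x) by a single explicit gradient step from x
x^{k+1} = x^{k} - \beta\,\nabla f_i\big(x^{k} - \alpha\,\nabla f_i(x^{k})\big).
```

Read this way, FO-MAML is an inexact gradient method on $F$: the implicit point $z_i(x)$ is replaced by one explicit step, which is why convergence is to a vicinity of a solution rather than to the exact minimizer.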

Findings

01. Proves convergence of FO-MAML to the vicinity of a solution.

02. Provides tighter guarantees with improved dependency on problem conditioning.

03. Shows iMAML objective lacks smoothness and convexity, limiting convergence guarantees.

Abstract

In this work, we consider the problem of minimizing the sum of Moreau envelopes of given functions, which has previously appeared in the context of meta-learning and personalized federated learning. In contrast to the existing theory that requires running subsolvers until a certain precision is reached, we only assume that a finite number of gradient steps is taken at each iteration. As a special case, our theory allows us to show the convergence of First-Order Model-Agnostic Meta-Learning (FO-MAML) to the vicinity of a solution of the Moreau objective. We also study a more general family of first-order algorithms that can be viewed as a generalization of FO-MAML. Our main theoretical achievement is an improvement upon the inexact SGD framework. In particular, our perturbed-iterate analysis allows for tighter guarantees that improve the dependency on the problem's conditioning.…
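The "vicinity of a solution" behaviour can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes hypothetical quadratic tasks f_i(z) = (a_i / 2) * ||z - c_i||^2, for which the minimizer of the averaged Moreau envelopes has a closed form, and it uses all tasks at every outer step (rather than sampling) to keep the run deterministic. The multi-step inner loop z <- x - alpha * grad f_i(z) is one assumed way to take "a finite number of gradient steps"; with `inner_steps=1` it reduces to the FO-MAML update. All names (`grad_f`, `moreau_minimizer`, `fo_maml`, `a`, `c`) are illustrative.

```python
import numpy as np

def grad_f(z, a_i, c_i):
    # Gradient of the hypothetical task loss f_i(z) = (a_i / 2) * ||z - c_i||^2.
    return a_i * (z - c_i)

def moreau_minimizer(a, c, alpha):
    # For these quadratics, grad F_i(x) = a_i / (1 + alpha * a_i) * (x - c_i),
    # so the minimizer of the averaged envelopes is a weighted mean of the c_i.
    w = a / (1.0 + alpha * a)
    return (w[:, None] * c).sum(axis=0) / w.sum()

def fo_maml(a, c, alpha, beta, inner_steps=1, iters=500):
    # Full-batch first-order meta-learning loop (deterministic variant).
    x = np.zeros(c.shape[1])
    for _ in range(iters):
        g = np.zeros_like(x)
        for a_i, c_i in zip(a, c):
            z = x
            for _ in range(inner_steps):
                # Inexact inner solve: each step always restarts from x.
                z = x - alpha * grad_f(z, a_i, c_i)
            # First-order meta-gradient: task gradient at the adapted point.
            g += grad_f(z, a_i, c_i)
        x = x - beta * g / len(a)
    return x

# Four toy tasks in 2D with different curvatures (assumed values).
a = np.array([0.5, 1.0, 2.0, 4.0])
c = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
alpha, beta = 0.1, 0.5

x_star = moreau_minimizer(a, c, alpha)
for steps in (1, 5):
    x = fo_maml(a, c, alpha, beta, inner_steps=steps)
    print(steps, np.linalg.norm(x - x_star))
```

In this toy setting the one-step variant settles at a point near, but not equal to, the Moreau minimizer, and taking more inner steps shrinks the gap, because the inner iteration approximates the proximal point more closely when alpha * a_i < 1.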

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · Machine Learning and ELM

Methods: Stochastic Gradient Descent · Model-Agnostic Meta-Learning