Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Xuxing Chen; Minhui Huang; Shiqian Ma; Krishnakumar Balasubramanian

arXiv:2210.12839 · math.OC · June 2, 2023 · 1 citation

TL;DR

This paper introduces a decentralized stochastic bilevel optimization (DSBO) algorithm that matches the best known sample complexity without estimating full Hessian or Jacobian matrices, which lowers the per-iteration cost in machine learning tasks.

Contribution

The paper presents a novel decentralized stochastic bilevel optimization (DSBO) algorithm that matches the best known sample complexity while reducing per-iteration computational cost by avoiding estimation of full Hessian and Jacobian matrices.

Findings

01. Matches best known sample complexity for DSBO

02. Reduces per-iteration complexity by avoiding full Hessian/Jacobian estimation

03. Requires only first-order stochastic, Hessian-vector, and Jacobian-vector oracles
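
To make the oracle requirement concrete, the sketch below computes a Hessian-vector product and a Jacobian-vector product for a toy lower-level objective without ever forming a Hessian or Jacobian matrix. Everything in it is an assumption for illustration: the quadratic objective g(x, y), the matrices A and B, the function names, and the use of central finite differences are not taken from the paper, which assumes stochastic oracles rather than this deterministic approximation.

```python
# Illustrative only: a toy quadratic lower-level objective
#   g(x, y) = 0.5 * y^T A y + x^T B y
# A, B, and every function name here are hypothetical, not from the paper.

A = [[3.0, 1.0], [1.0, 2.0]]  # symmetric, so the Hessian of g in y is A
B = [[1.0, 0.0], [2.0, 1.0]]  # couples x and y; the mixed derivative is B


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def grad_y(x, y):
    """Gradient of g with respect to y: A y + B^T x."""
    return [a + b for a, b in zip(matvec(A, y), matvec(transpose(B), x))]


def grad_x(x, y):
    """Gradient of g with respect to x: B y."""
    return matvec(B, y)


def directional_difference(grad_fn, x, y, v, eps=1e-4):
    """Central difference of grad_fn along direction v in the y argument."""
    y_plus = [yi + eps * vi for yi, vi in zip(y, v)]
    y_minus = [yi - eps * vi for yi, vi in zip(y, v)]
    g_plus, g_minus = grad_fn(x, y_plus), grad_fn(x, y_minus)
    return [(p - m) / (2.0 * eps) for p, m in zip(g_plus, g_minus)]


def hessian_vector_product(x, y, v):
    """Approximate (d^2 g / dy^2) v using two evaluations of grad_y."""
    return directional_difference(grad_y, x, y, v)


def jacobian_vector_product(x, y, v):
    """Approximate (d^2 g / dx dy) v using two evaluations of grad_x."""
    return directional_difference(grad_x, x, y, v)


x = [0.5, -1.0]
y = [1.0, 2.0]
v = [1.0, -1.0]
hv = hessian_vector_product(x, y, v)   # approximates A v
jv = jacobian_vector_product(x, y, v)  # approximates B v
```

Each product costs two gradient evaluations and stores only vectors, whereas building the full Hessian in y would need one such product per coordinate. In practice an automatic differentiation library would usually replace the finite differences, for example `jax.jvp` applied to `jax.grad` in JAX.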

Abstract

Bilevel optimization has recently received tremendous attention due to its great success in solving important machine learning problems like meta-learning, reinforcement learning, and hyperparameter optimization. Extending single-agent training on bilevel problems to the decentralized setting is a natural generalization, and there has been a flurry of work studying decentralized bilevel optimization algorithms. However, it remains unknown how to design a distributed algorithm whose sample complexity and convergence rate are comparable to those of SGD for stochastic optimization and that, at the same time, does not directly compute the exact Hessian or Jacobian matrices. In this paper we propose such an algorithm. More specifically, we propose a novel decentralized stochastic bilevel optimization (DSBO) algorithm that only requires first-order stochastic oracle, Hessian-vector product and Jacobian-vector…
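
For orientation, the Hessian and Jacobian matrices mentioned in the abstract can be read against the standard decentralized bilevel formulation. The notation below (n agents, local objectives f_i and g_i, dimensions p and q, the overall objective Φ) is an assumption based on common usage in the bilevel literature, not copied from the paper, whose exact setup may differ. It also assumes that g is strongly convex in y, so that y*(x) is unique.

```latex
\min_{x \in \mathbb{R}^p} \ \Phi(x) = \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(x, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) = \arg\min_{y \in \mathbb{R}^q} \ g(x, y) := \frac{1}{n}\sum_{i=1}^{n} g_i(x, y)

% Hypergradient via the implicit function theorem, with f := \frac{1}{n}\sum_{i=1}^{n} f_i
\nabla \Phi(x) = \nabla_x f\bigl(x, y^*(x)\bigr)
  - \nabla^2_{xy} g\bigl(x, y^*(x)\bigr)
    \bigl[\nabla^2_{yy} g\bigl(x, y^*(x)\bigr)\bigr]^{-1}
    \nabla_y f\bigl(x, y^*(x)\bigr)
```

Under this reading, the term involving the inverse can be obtained as the solution z of the linear system ∇²_yy g · z = ∇_y f, which iterative solvers can approximate using only Hessian-vector products, and ∇²_xy g · z is then a single Jacobian-vector product. Neither matrix has to be formed explicitly.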

Videos

Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data

Methods: Stochastic Gradient Descent