Hessian-Free Distributed Bilevel Optimization via Penalization with Time-Scale Separation

Youcheng Niu; Jinming Xu; Ying Sun; Li Chai; Jiming Chen

arXiv:2412.11218 · math.OC · February 27, 2026

TL;DR

This paper introduces AHEAD, a loopless distributed algorithm for bilevel optimization that avoids Hessian computations by using a penalized minimax reformulation and multiple-timescale updates, with proven convergence.

Contribution

It proposes a novel distributed bilevel optimization method that employs penalization and time-scale separation, eliminating the need for Hessian evaluations and providing convergence guarantees.
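To make the penalization idea concrete, the following is a sketch of a standard value-function penalty for bilevel problems. The symbols $f_i$, $g_i$, $x$, $y$, $z$, $\lambda$, and $n$ are assumptions introduced here for illustration and are not taken from the paper; the paper's penalty additionally enforces consensus among agents, which this sketch omits, so its exact construction may differ.

```latex
% Bilevel problem over n agents (notation assumed for illustration):
\min_{x} \; \frac{1}{n}\sum_{i=1}^{n} f_i\bigl(x, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) = \arg\min_{y} \; \frac{1}{n}\sum_{i=1}^{n} g_i(x, y).

% Value-function penalty with parameter \lambda > 0. Writing the inner
% optimal value as -\max_{z}\bigl(-g(x,z)\bigr) turns the penalized
% problem into a minimax problem that needs only first-order information:
\min_{x,\,y} \; \max_{z} \;
\frac{1}{n}\sum_{i=1}^{n} f_i(x, y)
+ \lambda \left( \frac{1}{n}\sum_{i=1}^{n} g_i(x, y)
               - \frac{1}{n}\sum_{i=1}^{n} g_i(x, z) \right).
```

Because the inner objective is strongly convex in its second argument, the term in parentheses is nonnegative and vanishes only when $y = y^{*}(x)$, so a larger $\lambda$ pushes $y$ toward the inner solution without ever forming a Hessian.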

Findings

01. Sharp convergence rates are established for nonconvex-strongly-convex settings, with distributed minimax problems covered as a special case.

02. Convergence performance depends on node heterogeneity, penalty parameters, and network connectivity.

03. Numerical experiments validate the theoretical results.

Abstract

This paper considers a class of distributed bilevel optimization (DBO) problems with a coupled inner-level subproblem. Existing approaches typically rely on hypergradient estimations involving computationally expensive Hessian evaluation. To address this, we approximate the DBO problem as a minimax problem by properly designing a penalty term that enforces both the constraint imposed by the inner-level subproblem and the consensus among the decision variables of agents. Moreover, we propose a loopless distributed algorithm, AHEAD, that employs multiple-timescale updates to solve the approximate problem asymptotically without requiring Hessian computation. Theoretically, we establish sharp convergence rates for nonconvex-strongly-convex settings and for distributed minimax problems as special cases. Our analysis reveals a clear dependence of convergence performance on node heterogeneity,…
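The combination of a penalized minimax reformulation with multiple-timescale, single-loop updates can be illustrated on a toy problem. The sketch below is not AHEAD itself: it is a single-agent, scalar example with no network or consensus step, and the objectives, step sizes, and the function name `penalized_bilevel` are all assumptions chosen for illustration. It uses outer objective F(x, y) = 0.5(y - 1)^2 + 0.5x^2 and inner objective g(x, y) = 0.5(y - x)^2, for which the true bilevel solution is x = 0.5 and the penalized problem has the closed-form stationary point x = lam / (1 + 2*lam).

```python
def penalized_bilevel(lam, steps=5000, alpha=0.01, gamma=0.5):
    """Single-loop, first-order updates on the penalized minimax problem
        min_{x,y} max_z  F(x, y) + lam * (g(x, y) - g(x, z))
    with F(x, y) = 0.5*(y - 1)^2 + 0.5*x^2 and g(x, y) = 0.5*(y - x)^2.
    Toy single-agent sketch; all names and constants are illustrative.
    """
    # y sees curvature (1 + lam), so its step is scaled to stay stable.
    beta = 0.5 / (1.0 + lam)
    x, y, z = 0.0, 0.0, 0.0
    for _ in range(steps):
        # Gradient in x: dF/dx + lam * (dg/dx at (x, y) - dg/dx at (x, z)).
        grad_x = x + lam * ((x - y) - (x - z))
        # Gradient in y: dF/dy + lam * dg/dy at (x, y).
        grad_y = (y - 1.0) + lam * (y - x)
        # z maximizes -lam * g(x, z), i.e. it descends g(x, .) toward y*(x).
        grad_z = z - x
        # One simultaneous update per iteration, no inner loops and no
        # Hessians. x moves on the slow timescale (alpha), while y and z
        # move on faster timescales (beta * (1 + lam) and gamma).
        x, y, z = x - alpha * grad_x, y - beta * grad_y, z - gamma * grad_z
    return x, y, z


for lam in (10.0, 100.0):
    x, y, z = penalized_bilevel(lam)
    print(f"lam={lam:g}: x={x:.6f}, closed form={lam / (1 + 2 * lam):.6f}")
```

In this toy setting the gap between the penalized solution and the true bilevel solution shrinks as the penalty parameter grows, which is one simple way to see why convergence performance would depend on the penalty parameters.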
