Distributed Bilevel Optimization with Dual Pruning for Resource-limited Clients
Mingyi Li, Xiao Zhang, Ruisheng Zheng, Hongjian Shi, Yuan Yuan, Xiuzhen Cheng, and Dongxiao Yu

TL;DR
This paper introduces a resource-adaptive distributed bilevel optimization framework that lets low-resource clients optimize submodels sized to their available resources, reducing per-client computation while retaining theoretical convergence guarantees.
Contribution
It proposes the first resource-adaptive distributed bilevel optimization framework, built on a second-order-free hypergradient estimator, for resource-limited clients.
Findings
Achieves an asymptotically optimal convergence rate of O(1/√(C_x^* Q)).
Demonstrates effectiveness and efficiency through extensive experiments.
Supports resource-limited clients in distributed bilevel optimization.
Abstract
With the development of large-scale models, traditional distributed bilevel optimization algorithms cannot be applied directly on low-resource clients. The key reason is the excessive computation involved in optimizing both the lower- and upper-level functions. Thus, we present the first resource-adaptive distributed bilevel optimization framework with a second-order-free hypergradient estimator, which allows each client to optimize submodels adapted to its available resources. Because the partial outer parameters x and inner parameters y influence each other, it is challenging to derive a theoretical upper bound on the globally averaged hypergradient for the full model parameters. The error bound on the inner parameters must also be reformulated because of local partial training. The provable theorems show that both RABO and RAFBO can achieve an asymptotically optimal…
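For context, the standard bilevel formulation behind the terms "hypergradient" and "second-order-free" can be written as follows. This is a textbook sketch, not the paper's own notation: the function names f (upper level) and g (lower level) are assumptions introduced here, and the hypergradient formula assumes g is twice differentiable and strongly convex in y.

```latex
% Generic bilevel problem (f, g are assumed names, not taken from the source).
\begin{aligned}
\min_{x}\ & \Phi(x) := f\bigl(x,\, y^{*}(x)\bigr)
  \quad \text{s.t.} \quad y^{*}(x) = \arg\min_{y}\, g(x, y), \\[4pt]
% Exact hypergradient via the implicit function theorem.
\nabla \Phi(x) ={}& \nabla_{x} f\bigl(x, y^{*}(x)\bigr)
  - \nabla^{2}_{xy} g\bigl(x, y^{*}(x)\bigr)
    \bigl[\nabla^{2}_{yy} g\bigl(x, y^{*}(x)\bigr)\bigr]^{-1}
    \nabla_{y} f\bigl(x, y^{*}(x)\bigr).
\end{aligned}
```

The second term requires a mixed second derivative and the inverse of a Hessian of g, which is the main computational burden on a resource-limited client. A second-order-free estimator, in the general sense of the term, approximates this hypergradient using only first-order gradients.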
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Optimization and Variational Analysis · Sparse and Compressive Sensing Techniques
