Differentially Private Bilevel Optimization

Guy Kornowski

arXiv:2409.19800·cs.LG·January 15, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces the first differentially private algorithms for bilevel optimization. They are also the first to avoid Hessian computations, which makes them applicable to large-scale machine learning problems, and they come with theoretical guarantees on the hypergradient norm of the returned point.

Contribution

The author develops the first DP algorithms for bilevel optimization, which also avoid Hessian computations. The analysis covers constrained and unconstrained problems and yields a private rule for tuning regularization hyperparameters.

Findings

01

Achieves hypergradient norm at most $\tilde{O}\left(\left(\frac{\sqrt{d_\text{up}}}{\epsilon n}\right)^{1/2}+\left(\frac{\sqrt{d_\text{low}}}{\epsilon n}\right)^{1/3}\right)$
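
To get a feel for how this bound scales, the short sketch below evaluates its two terms numerically. It is only an illustration: the function name `hypergradient_rate` is hypothetical, and the constants and logarithmic factors hidden by the $\tilde{O}$ notation are dropped, so the returned number indicates scaling rather than an actual guarantee.

```python
import math


def hypergradient_rate(d_up: int, d_low: int, eps: float, n: int) -> float:
    """Evaluate (sqrt(d_up)/(eps*n))^(1/2) + (sqrt(d_low)/(eps*n))^(1/3).

    Assumption: constants and log factors hidden by the O-tilde are ignored,
    so this shows only how the bound scales with its parameters.
    """
    if d_up <= 0 or d_low <= 0 or n <= 0 or eps <= 0:
        raise ValueError("d_up, d_low, n and eps must all be positive")
    upper_term = (math.sqrt(d_up) / (eps * n)) ** 0.5
    lower_term = (math.sqrt(d_low) / (eps * n)) ** (1.0 / 3.0)
    return upper_term + lower_term


# Example: d_up = 100, d_low = 64, eps = 1, n = 1000 gives roughly 0.1 + 0.2 = 0.3.
print(hypergradient_rate(100, 64, 1.0, 1000))
```

For large $n$ the lower-level term, with its $1/3$ exponent, decays more slowly and therefore dominates the sum.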

02

Applicable to constrained and unconstrained problems, with a mini-batch variant and coverage of both empirical and population losses

03

Provides a simple private rule for tuning regularization hyperparameters.

Abstract

We present differentially private (DP) algorithms for bilevel optimization, a problem class that has recently received significant attention in various machine learning applications. These are the first algorithms for such problems under standard DP constraints, and are also the first to avoid Hessian computations, which are prohibitive in large-scale settings. Under the well-studied setting in which the upper level is not necessarily convex and the lower-level problem is strongly convex, our proposed gradient-based $(ϵ, δ)$-DP algorithm returns a point with hypergradient norm at most $\widetilde{O}((\sqrt{d_{up}} / ϵ n)^{1/2} + (\sqrt{d_{low}} / ϵ n)^{1/3})$, where $n$ is the dataset size, and $d_{up}$ and $d_{low}$ are the upper- and lower-level dimensions. Our analysis covers constrained and unconstrained problems alike, accounts for…
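
As a rough illustration of what estimating a hypergradient without Hessians can look like, the sketch below uses the penalty formulation from the non-private first-order literature on a toy quadratic problem. It is not the paper's algorithm: it adds no privacy noise, and the problem instance and the names `gradient_descent`, `penalty_hypergradient`, and `exact_hypergradient` are assumptions made for this example. The lower level is $g(x, y) = \frac{1}{2}\|y - Ax\|^2$ and the upper level is $f(x, y) = \frac{1}{2}\|y - b\|^2$, so the exact hypergradient $A^\top(Ax - b)$ is available for comparison.

```python
import numpy as np


def gradient_descent(grad, y0, step, iters=200):
    """Plain gradient descent; only first-order information is used."""
    y = y0.copy()
    for _ in range(iters):
        y = y - step * grad(y)
    return y


def penalty_hypergradient(x, A, b, lam):
    """Non-private penalty-based hypergradient estimate for the toy problem
    f(x, y) = 0.5*||y - b||^2, g(x, y) = 0.5*||y - A x||^2 (assumed example)."""
    grad_y_f = lambda y: y - b
    grad_y_g = lambda y: y - A @ x
    grad_x_f = lambda y: np.zeros_like(x)      # this toy f does not depend on x
    grad_x_g = lambda y: -A.T @ (y - A @ x)

    y0 = np.zeros(A.shape[0])
    # y*(x): minimizer of the lower-level objective g(x, .)
    y_star = gradient_descent(grad_y_g, y0, step=0.5)
    # y_lam: minimizer of the penalized objective f(x, .) + lam * g(x, .)
    y_lam = gradient_descent(
        lambda y: grad_y_f(y) + lam * grad_y_g(y), y0, step=0.5 / (1.0 + lam)
    )
    # Gradient-only estimate; no second derivatives are formed.
    return grad_x_f(y_lam) + lam * (grad_x_g(y_lam) - grad_x_g(y_star))


def exact_hypergradient(x, A, b):
    """Closed form for this toy problem: F(x) = 0.5*||A x - b||^2."""
    return A.T @ (A @ x - b)


rng = np.random.default_rng(0)
A, b, x = rng.normal(size=(3, 2)), rng.normal(size=3), rng.normal(size=2)
for lam in (10.0, 100.0, 1000.0):
    err = np.linalg.norm(penalty_hypergradient(x, A, b, lam) - exact_hypergradient(x, A, b))
    print(f"lam={lam:g}  error={err:.2e}")
```

On this instance the estimate equals $\frac{\lambda}{1+\lambda}$ times the exact hypergradient, so the error shrinks proportionally to $1/(1+\lambda)$ as the penalty parameter grows.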

Peer Reviews

Decision·ALT 2026

Reviewer 01 · Rating 6 · Confidence 2

Strengths

- The paper studies bilevel optimization under central DP, establishing first results in the area.
- It provides a mini-batch variant and addresses both ERM and population risks.
- The paper has a well-organized structure.

Weaknesses

See questions below

Reviewer 02 · Rating 5 · Confidence 3

Strengths

This framework can work with different inner algorithms with only dependency on its convergence rate and DP parameters.

Weaknesses

While the methods outlined in the paper appear innovative, they lack a clear comparative analysis with existing methods. Fully first-order methods have already been established in non-DP settings; however, it's not apparent whether the DP version introduces significant additional complexities. The paper lacks empirical evaluation, which is noted as future work. This omission is unconventional and limits the ability to gauge practical effectiveness. Exploring the interaction between outer and i

Reviewer 03 · Rating 8 · Confidence 4

Strengths

This submission provides a clear contribution. Private bilevel optimization is certainly worthy of study. The paper is written well.

Weaknesses

The submission is not very deep: once the problem is stated and we've decided to follow the non-private first-order penalty methods, the analysis strikes me as essentially a process of assembling the right tools, carefully applying them, and tracking the error. (I don't mean to imply that this is trivial, just that the paper would appeal to a wider audience if it had new ideas for private optimization. Maybe it does, and I wasn't able to pick them up?)

Taxonomy

Topics: Stochastic processes and financial applications · Advanced Banach Space Theory · Economic theories and models

Methods: Softmax · Attention Is All You Need