Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

Parvin Nazari; Bojian Hou; Davoud Ataee Tarzanagh; Li Shen; George Michailidis

arXiv:2511.01126 · cs.LG · May 20, 2026

TL;DR

This paper introduces first- and zeroth-order stochastic algorithms for online bilevel optimization that achieve sublinear stochastic bilevel regret without window smoothing, with experiments on online loss tuning and adversarial attack tasks.

Contribution

The work presents a novel search direction and stochastic algorithms for online bilevel optimization (OBO) that remove the need for window smoothing and reduce oracle dependence in hypergradient estimation.

Findings

01. Achieve sublinear stochastic bilevel regret without window smoothing

02. Reduce oracle dependence in hypergradient estimation

03. Validated on online loss tuning and adversarial attack tasks

Abstract

Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time, requiring dynamic updates. Current OBO approaches rely on deterministic window-smoothed regret minimization, which may not accurately reflect system performance when functions change rapidly. In this work, we introduce a novel search direction and show that both first- and zeroth-order (ZO) stochastic OBO algorithms leveraging this direction achieve sublinear stochastic bilevel regret without window smoothing. Beyond these guarantees, our framework enhances efficiency by: (i) reducing oracle dependence in hypergradient estimation, (ii) updating inner and outer variables alongside the linear system solution, and (iii) employing ZO-based estimation of Hessians, Jacobians, and gradients. Experiments on online parametric loss tuning and…
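
Item (iii) of the abstract mentions ZO-based estimation of gradients. As a rough illustration of what a zeroth-order estimator looks like in general, the following is a minimal sketch of the standard two-point Gaussian-smoothing gradient estimator, which uses only function evaluations. It is a generic textbook construction and not necessarily the estimator used in the paper; the function name `zo_gradient` and the parameters `mu` (smoothing radius) and `num_samples` are assumptions made for this example.

```python
import numpy as np


def zo_gradient(f, x, mu=1e-4, num_samples=20000, rng=None):
    """Estimate the gradient of f at x from function values only.

    Hypothetical helper for illustration: averages the two-point
    Gaussian-smoothing estimator (f(x + mu*u) - f(x)) / mu * u
    over random directions u ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    fx = f(x)  # baseline value, reused for every random direction
    grad = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)          # random search direction
        grad += (f(x + mu * u) - fx) / mu * u     # finite difference along u
    return grad / num_samples


if __name__ == "__main__":
    # Toy objective (an assumption for the demo): f(x) = 0.5 * ||x||^2,
    # whose exact gradient at x is x itself.
    x0 = np.array([1.0, -2.0, 0.5])
    estimate = zo_gradient(lambda x: 0.5 * float(np.dot(x, x)), x0,
                           rng=np.random.default_rng(0))
    print("ZO estimate:", estimate)
    print("exact grad :", x0)
```

The estimate is noisy: its variance shrinks as `num_samples` grows, while `mu` trades smoothing bias against numerical cancellation. Analogous finite-difference constructions can be applied to Hessian-vector and Jacobian-vector products, which is the kind of oracle the abstract's item (iii) refers to.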

Videos

Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques