Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Lucy Gao; Jane J. Ye; Haian Yin; Shangzhi Zeng; Jin Zhang

arXiv:2206.05976·math.OC·June 14, 2022·1 cites

TL;DR

This paper introduces VF-iDCA, an algorithm that finds stationary solutions of bilevel hyperparameter tuning problems without requiring the lower-level problem to be strongly convex or smooth. In the authors' experiments it outperforms existing hyperparameter tuning methods.

Contribution

The paper develops a new Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA) that converges to stationary solutions for a broad class of hyperparameter tuning problems without assuming lower-level strong convexity (LLSC) or lower-level smoothness (LLS).

Findings

01. VF-iDCA reaches stationary solutions without assuming that the lower-level problem is strongly convex (LLSC) or smooth (LLS).

02. Experimental results show that VF-iDCA outperforms existing hyperparameter tuning methods.

03. Theoretical analysis establishes that VF-iDCA is sequentially convergent.

Abstract

Gradient-based optimization methods for hyperparameter tuning guarantee theoretical convergence to stationary solutions when, for fixed upper-level variable values, the lower level of the bilevel program is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this algorithm achieves stationary solutions without LLSC and LLS assumptions for bilevel programs from a broad class of hyperparameter tuning applications. Our extensive experiments confirm our theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.
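
To make the "value function" idea in the abstract concrete, here is a minimal Python sketch on a toy problem. It is not the paper's formulation or the VF-iDCA algorithm: the 1-D lasso-type lower level, the data value `a`, and all function names are assumptions chosen for illustration. The lower-level objective is nonsmooth in `w` (so LLS fails), and its value function v(lam) = min_w f(w, lam) lets lower-level optimality be written as the single condition f(w, lam) - v(lam) <= 0.

```python
# Toy illustration only: a 1-D lasso-type lower level (hypothetical example,
# not the formulation or algorithm from the paper). Assumes lam >= 0.

def lower_objective(w: float, lam: float, a: float) -> float:
    """Lower-level objective f(w, lam) = 0.5*(w - a)^2 + lam*|w| (nonsmooth in w)."""
    return 0.5 * (w - a) ** 2 + lam * abs(w)

def soft_threshold(a: float, lam: float) -> float:
    """Closed-form minimizer over w of the lower-level objective."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0

def value_function(lam: float, a: float) -> float:
    """v(lam) = min_w f(w, lam), evaluated at the closed-form minimizer."""
    return lower_objective(soft_threshold(a, lam), lam, a)

def optimality_gap(w: float, lam: float, a: float) -> float:
    """f(w, lam) - v(lam): nonnegative, and zero only when w solves the lower level."""
    return lower_objective(w, lam, a) - value_function(lam, a)

if __name__ == "__main__":
    a, lam = 2.0, 0.5
    w_star = soft_threshold(a, lam)
    print("gap at lower-level solution:", optimality_gap(w_star, lam, a))
    print("gap at w = 0:", optimality_gap(0.0, lam, a))
```

In this toy case the gap vanishes at the soft-thresholded solution and is strictly positive elsewhere, which is the property a value-function reformulation relies on to replace the lower-level problem with one inequality constraint.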

Code & Models

Repositories

sustech-optimization/vf-idca (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization