Hyperparameter Tuning Through Pessimistic Bilevel Optimization

Meltem Apaydin Ustun; Liang Xu; Bo Zeng; Xiaoning Qian

arXiv:2412.03666·cs.LG·December 6, 2024

TL;DR

This paper introduces a pessimistic bilevel hyperparameter optimization method that explicitly accounts for model uncertainty, yielding more robust hyperparameters and better generalization when training data are limited or perturbed.

Contribution

The paper proposes a pessimistic bilevel optimization framework for hyperparameter tuning that accounts for model uncertainty, together with a relaxation-based approximation method for solving it.
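To make the optimistic/pessimistic distinction concrete, here is a minimal toy sketch. It is not the paper's relaxation-based method: it enumerates a small finite grid, and every name in it (`lower_level_solutions`, `bilevel_select`, `toy_train_loss`, `toy_val_loss`, the grids) is a hypothetical chosen for illustration. The toy training loss has two minimizers, `w = +lam` and `w = -lam`, so the lower-level solution is deliberately non-unique.

```python
from typing import Callable, List, Sequence

Loss = Callable[[float, float], float]


def lower_level_solutions(
    lam: float, candidates: Sequence[float], train_loss: Loss, tol: float = 1e-12
) -> List[float]:
    """Return every candidate model w whose training loss is minimal for lam."""
    losses = [train_loss(lam, w) for w in candidates]
    best = min(losses)
    return [w for w, loss in zip(candidates, losses) if loss - best <= tol]


def bilevel_select(
    lams: Sequence[float],
    candidates: Sequence[float],
    train_loss: Loss,
    val_loss: Loss,
    pessimistic: bool,
) -> float:
    """Pick the hyperparameter by brute force over a finite grid (illustration only)."""
    # Optimistic: assume the best trained model is returned (inner min).
    # Pessimistic: guard against the worst trained model (inner max).
    aggregate = max if pessimistic else min

    def score(lam: float) -> float:
        solutions = lower_level_solutions(lam, candidates, train_loss)
        return aggregate(val_loss(lam, w) for w in solutions)

    return min(lams, key=score)


def toy_train_loss(lam: float, w: float) -> float:
    # Assumed toy loss: minimized at both w = +lam and w = -lam.
    return (abs(w) - lam) ** 2


def toy_val_loss(lam: float, w: float) -> float:
    # Assumed toy validation loss: prefers w close to 1.
    return (w - 1.0) ** 2


LAMS = [0.0, 0.5, 1.0]
CANDIDATES = [-1.0, -0.5, 0.0, 0.5, 1.0]

optimistic_lam = bilevel_select(LAMS, CANDIDATES, toy_train_loss, toy_val_loss, pessimistic=False)
pessimistic_lam = bilevel_select(LAMS, CANDIDATES, toy_train_loss, toy_val_loss, pessimistic=True)
```

In this toy, the optimistic choice is `lam = 1.0`, because one of its two trained models (`w = 1`) has zero validation loss. The pessimistic choice is `lam = 0.0`, whose single trained model has validation loss 1, whereas `lam = 1.0` could also return `w = -1` with validation loss 4.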

Findings

01. Pessimistic solutions outperform optimistic ones with limited training data.

02. Pessimistic approach yields more robust prediction models.

03. Empirical results show improved generalization under data perturbations.

Abstract

Automated hyperparameter search in machine learning, especially for deep learning models, is typically formulated as a bilevel optimization problem, with hyperparameter values determined by the upper level and the model learning achieved by the lower-level problem. Most of the existing bilevel optimization solutions either assume the uniqueness of the optimal training model given hyperparameters or adopt an optimistic view when the non-uniqueness issue emerges. Potential model uncertainty may arise when training complex models with limited data, especially when the uniqueness assumption is violated. Thus, the suitability of the optimistic view underlying current bilevel hyperparameter optimization solutions is questionable. In this paper, we propose pessimistic bilevel hyperparameter optimization to assure appropriate outer-level hyperparameters to better generalize the inner-level…
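The two viewpoints contrasted in the abstract can be written in the standard bilevel form. The notation below is an assumption for illustration and is not taken from the paper: \lambda denotes the hyperparameters, w the model parameters, f the lower-level (training) objective, and F the upper-level (validation) objective. The block assumes the `amsmath` package.

```latex
\begin{align}
  % Lower-level solution set: every model that is optimal for training under \lambda
  S(\lambda) &= \operatorname*{arg\,min}_{w} \; f(\lambda, w) \\
  % Optimistic view: the most favorable lower-level solution is assumed
  \text{(optimistic)} \quad & \min_{\lambda} \; \min_{w \in S(\lambda)} F(\lambda, w) \\
  % Pessimistic view: the least favorable lower-level solution is guarded against
  \text{(pessimistic)} \quad & \min_{\lambda} \; \max_{w \in S(\lambda)} F(\lambda, w)
\end{align}
```

When S(\lambda) contains a single model for every \lambda, the inner min and max coincide and the two problems are identical. They differ only when the lower-level solution is non-unique.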

Taxonomy

Topics: Advanced Bandit Algorithms Research · Advanced Multi-Objective Optimization Algorithms · Advanced Optimization Algorithms Research

Methods: ADaptive gradient method with the OPTimal convergence rate