Relax and penalize: a new bilevel approach to mixed-binary hyperparameter optimization

Sara Venturini; Marianna de Santis (UNIROMA); Jordan Patracone (MALICE); Francesco Rinaldi (Unipd); Saverio Salzo (DIAG UNIROMA); Martin Schmidt

arXiv:2308.10711 · cs.LG · March 20, 2025

Open Access

TL;DR

This paper introduces a novel bilevel optimization method for mixed-binary hyperparameters in machine learning, using a continuous reformulation with penalties to ensure consistent solutions, and demonstrates competitive results in two applications.

Contribution

The paper presents a new bilevel approach that handles mixed-binary hyperparameters directly through an equivalent continuous reformulation with a penalty term, avoiding the inconsistent solutions that relax-and-round strategies can produce.

Findings

01. The method is guaranteed to produce mixed-binary solutions under suitable assumptions.

02. The approach is compatible with existing continuous bilevel solvers.

03. It achieves competitive performance on the two hyperparameter estimation tasks considered.

Abstract

In recent years, bilevel approaches have become very popular for efficiently estimating high-dimensional hyperparameters of machine learning models. However, to date, binary parameters are handled by continuous relaxation and rounding strategies, which can lead to inconsistent solutions. In this context, we tackle the challenging optimization of mixed-binary hyperparameters by resorting to an equivalent continuous bilevel reformulation based on an appropriate penalty term. We propose an algorithmic framework that, under suitable assumptions, is guaranteed to provide mixed-binary solutions. Moreover, the generality of the method allows existing continuous bilevel solvers to be used safely within the proposed framework. We evaluate the performance of our approach for two specific machine learning problems, i.e., the estimation of the group-sparsity structure in regression problems and the data…
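The relax-and-penalize idea mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the concave penalty `t * (1 - t)`, the toy objective `outer_loss`, the parameter name `eps`, and the use of a grid search. It is also single-level, so the lower-level training problem of a real bilevel setup is left out. The sketch only shows that once binary variables are relaxed to [0, 1], adding a penalty that vanishes exactly at binary points can push the minimizer onto a binary point when the penalty weight is large enough.

```python
def binary_penalty(theta):
    """Concave penalty on [0, 1]^n; zero exactly when every entry is 0 or 1."""
    return sum(t * (1.0 - t) for t in theta)


def outer_loss(theta):
    """Hypothetical smooth objective whose relaxed minimizer (0.3, 0.8) is not binary."""
    return (theta[0] - 0.3) ** 2 + (theta[1] - 0.8) ** 2


def penalized(theta, eps):
    """Relaxed objective plus penalty; a smaller eps means a heavier penalty."""
    return outer_loss(theta) + binary_penalty(theta) / eps


def grid_argmin(f, steps=100):
    """Brute-force minimizer of f over a uniform grid on [0, 1]^2 (toy solver)."""
    grid = [i / steps for i in range(steps + 1)]
    return min(((a, b) for a in grid for b in grid), key=f)


if __name__ == "__main__":
    for eps in (1e6, 0.5):
        theta = grid_argmin(lambda th: penalized(th, eps))
        print(f"eps={eps}: minimizer={theta}, penalty={binary_penalty(theta)}")
```

With a negligible penalty the grid search stays at the fractional point, while with `eps = 0.5` each coordinate of the penalized objective becomes concave and the minimum moves to a vertex of the box. In practice the grid search would be replaced by a continuous solver, which is the role the abstract assigns to existing continuous bilevel solvers.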

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference