BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

Yu-Heng Hung; Kai-Jie Lin; Yu-Heng Lin; Chien-Yi Wang; Cheng Sun; Ping-Chun Hsieh

arXiv:2505.21974·cs.LG·May 30, 2025

TL;DR

BOFormer introduces a sequence modeling approach using deep Q-learning to improve multi-objective Bayesian optimization, effectively addressing hypervolume identifiability issues and outperforming existing algorithms in diverse tasks.

Contribution

This paper proposes BOFormer, a novel non-Markovian RL framework with Transformers for multi-objective Bayesian optimization, overcoming hypervolume identifiability challenges.
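The core idea of a non-Markovian Q-function can be sketched in a few lines: the value is indexed by the whole query history rather than by a single state. The snippet below is a tabular stand-in under stated assumptions, not BOFormer itself, which uses a Transformer to encode the history. The names `q_update`, `alpha`, and `gamma`, and the toy history encoding, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

# A history is the full sequence of (query, observation) pairs seen so far.
History = Tuple[Hashable, ...]
QTable = DefaultDict[Tuple[History, Hashable], float]


def q_update(
    q: QTable,
    history: History,
    action: Hashable,
    reward: float,
    next_history: History,
    actions: Sequence[Hashable],
    alpha: float = 0.5,   # learning rate (assumed value)
    gamma: float = 0.9,   # discount factor (assumed value)
) -> float:
    """One Q-learning step where the 'state' is the entire history."""
    # Bootstrap from the best action value under the extended history.
    best_next = max(q[(next_history, a)] for a in actions)
    target = reward + gamma * best_next
    # Move the current estimate toward the target.
    q[(history, action)] += alpha * (target - q[(history, action)])
    return q[(history, action)]


# Toy usage: two candidate queries, one observed transition replayed twice.
q: QTable = defaultdict(float)
actions = ["x1", "x2"]
h0: History = ()
h1: History = (("x1", 1.0),)  # history after querying x1 and observing 1.0

first = q_update(q, h0, "x1", reward=1.0, next_history=h1, actions=actions)
second = q_update(q, h0, "x1", reward=1.0, next_history=h1, actions=actions)
print(first, second)  # 0.5 0.75
```

A table keyed on histories does not scale beyond toy problems, which is one reason a sequence model is a natural replacement for the lookup.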

Findings

01. BOFormer consistently outperforms benchmark algorithms in synthetic MOBO tasks.

02. BOFormer also outperforms them in real-world multi-objective hyperparameter optimization.

03. The source code for BOFormer is publicly available for further research.

Abstract

Bayesian optimization (BO) offers an efficient pipeline for optimizing black-box functions with the help of a Gaussian process prior and an acquisition function (AF). Recently, in the context of single-objective BO, learning-based AFs have shown promising empirical results thanks to their favorable non-myopic nature. Despite this, the direct extension of these approaches to multi-objective Bayesian optimization (MOBO) suffers from the hypervolume identifiability issue, which results from the non-Markovian nature of MOBO problems. To tackle this, inspired by the non-Markovian RL literature and the success of Transformers in language modeling, we present a generalized deep Q-learning framework and propose BOFormer, which substantiates this framework for MOBO via sequence modeling. Through extensive evaluation, we demonstrate that BOFormer consistently outperforms the benchmark…
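To make the history dependence behind the hypervolume identifiability issue concrete, the following sketch computes the two-objective hypervolume and shows that the same query point yields different hypervolume improvements under different histories. It assumes maximization with a reference point at the origin; the function names `hypervolume_2d` and `hv_improvement` are hypothetical, and the paper's formal definition of the issue may be stated differently.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives maximized)."""
    # Only points that strictly improve on the reference contribute area.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective down, tracking the best second objective.
    pts.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y > best_y:
            # Add the new horizontal strip this point contributes.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


def hv_improvement(new: Point, history: List[Point]) -> float:
    """Hypervolume gained by adding `new` to the already evaluated `history`."""
    return hypervolume_2d(history + [new]) - hypervolume_2d(history)


candidate = (2.0, 2.0)
print(hv_improvement(candidate, []))                        # 4.0
print(hv_improvement(candidate, [(1.0, 3.0), (3.0, 1.0)]))  # 1.0
print(hv_improvement(candidate, [(2.5, 2.5)]))              # 0.0
```

The reward for the same candidate ranges from 4.0 down to 0.0 depending on what was evaluated before, so a representation that summarizes only the current candidate cannot determine it.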

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 5 · Confidence 3

Strengths

The paper introduces an innovative approach by framing multi-objective Bayesian optimization (MOBO) as a non-Markovian reinforcement learning problem. This represents a creative combination of existing ideas from non-Markovian RL and Transformer-based sequence modeling, marking a fresh perspective in the field. The use of diagrams and examples, such as the hypervolume identifiability issue, aids in understanding complex concepts.

Weaknesses

1. The discussion of shortcomings in the paper is relatively brief and does not clearly articulate the innovative aspects of the work. Furthermore, the contributions appear to be somewhat incremental rather than groundbreaking.
2. The use of Transformers may require substantial computational resources and memory, which could limit accessibility for some users.
3. The explanations regarding the experimental section lack clarity. The paper does not specify how the proposed algorithm's time efficie

Reviewer 02 · Rating 6 · Confidence 3

Strengths

One of this paper's key strengths is its novel approach to addressing the hypervolume identifiability issue in multi-objective Bayesian optimization (MOBO). By presenting the Generalized DQN framework and implementing it through BOFormer, the authors tackle MOBO's inherent non-Markovian nature. This innovative perspective of reinterpreting MOBO as a sequence modeling problem using Transformers allows for a more effective and efficient solution to the identifiability issue. Another strength lies

Weaknesses

Limited theoretical analysis: Although the paper introduces the Generalized DQN framework and provides empirical evidence of its effectiveness, it lacks an in-depth theoretical analysis of the proposed approach. A more rigorous theoretical foundation could help better understand BOFormer's convergence properties, optimality guarantees, and limitations.

Scalability to high-dimensional problems: While BOFormer performs well on the tested problems, its scalability to high-dimensional MOBO problems

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The authors have identified and thoroughly addressed a fundamental issue in learning-based MOBO approaches - the hypervolume identifiability problem. This represents a significant contribution to the field. The theoretical framework connecting non-Markovian RL to MOBO is rigorously developed and mathematically sound. The experimental evaluation is comprehensive, comparing the method against both classical and learning-based baselines across diverse scenarios.

Weaknesses

The contribution appears somewhat incremental relative to existing approaches like OptFormer and NAP. While the authors introduce novel elements, the core methodology builds heavily on established techniques. The motivation for using Transformers in this context needs stronger justification. Given the small data regime typical in Bayesian optimization, the choice of a Transformer architecture, which typically requires substantial data for effective training, requires more thorough explanation.

Videos

BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL · SlidesLive

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification

Methods: Gaussian Process · Q-Learning