RTDK-BO: High Dimensional Bayesian Optimization with Reinforced Transformer Deep kernels
Alexander Shmakov, Avisek Naug, Vineet Gundecha, Sahand Ghorbanpour, Ricardo Luna Gutierrez, Ashwin Ramesh Babu, Antonio Guillen, Soumyendu Sarkar

TL;DR
This paper introduces RTDK-BO, a high-dimensional Bayesian Optimization method that integrates Transformer-based deep kernels with reinforcement learning to improve surrogate modeling and exploration, achieving state-of-the-art results on high-dimensional problems.
Contribution
It combines attention-based Transformer models with deep kernel learning and reinforcement learning to improve meta-learning in high-dimensional Bayesian Optimization.
Findings
Achieves state-of-the-art performance on high-dimensional problems.
Effectively models shared similarities between related objectives.
Enhances exploration with reinforcement learning-based acquisition functions.
Abstract
Bayesian Optimization (BO), guided by Gaussian process (GP) surrogates, has proven to be an invaluable technique for efficient, high-dimensional, black-box optimization, a critical problem inherent to many applications such as industrial design and scientific computing. Recent contributions have introduced reinforcement learning (RL) to improve the optimization performance on both single function optimization and few-shot multi-objective optimization. However, even few-shot techniques fail to exploit similarities shared between closely related objectives. In this paper, we combine recent developments in Deep Kernel Learning (DKL) and attention-based Transformer models to improve the modeling powers of GP surrogates with meta-learning. We propose a novel method for improving meta-learning BO surrogates by incorporating attention mechanisms into DKL, empowering the surrogates to…
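To make the Deep Kernel Learning idea in the abstract concrete, here is a minimal NumPy sketch of a GP surrogate whose RBF kernel is applied to learned features g(x) instead of the raw inputs. It is an illustration under stated assumptions, not the architecture from the paper: the feature map is a toy two-layer network with random, untrained weights standing in for the Transformer encoder, and all names (feature_map, deep_rbf_kernel, gp_posterior), dimensions, and hyperparameters are hypothetical.

```python
import numpy as np

def feature_map(x, w1, w2):
    """Toy two-layer network g(x); an assumed stand-in for a Transformer encoder."""
    return np.tanh(x @ w1) @ w2

def deep_rbf_kernel(xa, xb, w1, w2, lengthscale=1.0):
    """Deep kernel: k(x, x') = exp(-||g(x) - g(x')||^2 / (2 * lengthscale^2))."""
    za, zb = feature_map(xa, w1, w2), feature_map(xb, w1, w2)
    sq_dist = ((za[:, None, :] - zb[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, w1, w2, noise=1e-3):
    """Posterior mean and variance of a zero-mean GP under the deep kernel."""
    k_tt = deep_rbf_kernel(x_train, x_train, w1, w2) + noise * np.eye(len(x_train))
    k_st = deep_rbf_kernel(x_test, x_train, w1, w2)
    chol = np.linalg.cholesky(k_tt)  # k_tt = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_st.T)
    mean = k_st @ alpha
    var = 1.0 - (v ** 2).sum(axis=0)  # prior variance k(x, x) is 1 for an RBF kernel
    return mean, var

# Hypothetical 20-dimensional toy problem with random (untrained) weights.
rng = np.random.default_rng(0)
dim, hidden, feat = 20, 32, 4
w1 = rng.normal(size=(dim, hidden)) / np.sqrt(dim)
w2 = rng.normal(size=(hidden, feat)) / np.sqrt(hidden)
x_train = rng.uniform(-1, 1, size=(15, dim))
y_train = np.sin(x_train.sum(axis=1))
x_test = rng.uniform(-1, 1, size=(5, dim))
mean, var = gp_posterior(x_train, y_train, x_test, w1, w2)
```

In an actual deep-kernel setup the feature-map weights would be trained, typically by maximizing the GP marginal likelihood, and in a meta-learning setting they would be shared across related objectives. The sketch only shows how the learned features enter the kernel and the posterior.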
Taxonomy
Methods: Deep Kernel Learning · Attention Is All You Need · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Layer Normalization · Linear Layer
