RTDK-BO: High Dimensional Bayesian Optimization with Reinforced Transformer Deep kernels

Alexander Shmakov; Avisek Naug; Vineet Gundecha; Sahand Ghorbanpour; Ricardo Luna Gutierrez; Ashwin Ramesh Babu; Antonio Guillen; Soumyendu Sarkar

arXiv:2310.03912·cs.LG·November 9, 2023

TL;DR

This paper introduces RTDK-BO, a high-dimensional Bayesian Optimization method that integrates Transformer-based deep kernels into the surrogate model and uses reinforcement learning to guide exploration, achieving state-of-the-art results.

Contribution

RTDK-BO combines attention-based Transformer models with deep kernel learning and reinforcement learning to improve meta-learning in high-dimensional Bayesian Optimization.
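To make the idea of an attention-based deep kernel concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's architecture: each input coordinate is treated as one token, a single untrained self-attention layer produces a feature vector, and an RBF kernel on those features feeds a standard GP posterior. The names `attention_features`, `deep_rbf_kernel`, and `gp_posterior`, the sizes `D` and `M`, and the toy objective are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

D, M = 8, 16  # input dimension and token width (both hypothetical choices)

# Randomly initialised, untrained weights. In deep kernel learning these
# would be fit jointly with the GP hyperparameters; training is omitted here.
W_embed = rng.normal(size=(1, M))        # lifts each scalar coordinate to R^M
P_pos = rng.normal(size=(D, M)) * 0.1    # one position embedding per coordinate
W_q, W_k, W_v = (rng.normal(size=(M, M)) / np.sqrt(M) for _ in range(3))


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_features(X):
    """Map inputs of shape (n, D) to features of shape (n, M)."""
    tokens = X[:, :, None] * W_embed[None, :, :] + P_pos[None]  # (n, D, M)
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(M)              # (n, D, D)
    attended = softmax(scores, axis=-1) @ v                     # (n, D, M)
    return attended.mean(axis=1)                                # mean-pool tokens


def deep_rbf_kernel(XA, XB, lengthscale=1.0):
    """RBF kernel evaluated on attention features instead of raw inputs."""
    FA, FB = attention_features(XA), attention_features(XB)
    sq_dist = ((FA[:, None, :] - FB[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(X_train, y_train, X_test, noise=1e-4):
    """Standard GP posterior mean and variance under the deep kernel."""
    K = deep_rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = deep_rbf_kernel(X_test, X_train)
    mean = K_s @ np.linalg.solve(K, y_train)
    # Prior variance is 1 because the RBF kernel equals 1 at zero distance.
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)
    return mean, var


# Toy data: five observed points and three query points in D dimensions.
X_train = rng.uniform(-1, 1, size=(5, D))
y_train = np.sin(X_train.sum(axis=1))
X_test = rng.uniform(-1, 1, size=(3, D))
mean, var = gp_posterior(X_train, y_train, X_test)
```

The posterior mean and variance returned here are the quantities an acquisition function would consume. The reinforcement learning component is not shown.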

Findings

01. Achieves state-of-the-art performance on high-dimensional problems.

02. Effectively models shared similarities between related objectives.

03. Enhances exploration with reinforcement learning-based acquisition functions.

Abstract

Bayesian Optimization (BO), guided by Gaussian process (GP) surrogates, has proven to be an invaluable technique for efficient, high-dimensional, black-box optimization, a critical problem inherent to many applications such as industrial design and scientific computing. Recent contributions have introduced reinforcement learning (RL) to improve the optimization performance on both single function optimization and few-shot multi-objective optimization. However, even few-shot techniques fail to exploit similarities shared between closely related objectives. In this paper, we combine recent developments in Deep Kernel Learning (DKL) and attention-based Transformer models to improve the modeling powers of GP surrogates with meta-learning. We propose a novel method for improving meta-learning BO surrogates by incorporating attention mechanisms into DKL, empowering the surrogates to…

Taxonomy

Methods: Deep Kernel Learning · Attention Is All You Need · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Layer Normalization · Linear Layer