Scalable3-BO: Big Data meets HPC - A scalable asynchronous parallel high-dimensional Bayesian optimization framework on supercomputers
Anh Tran

TL;DR
This paper introduces Scalable3-BO, a Bayesian optimization framework for high-dimensional problems with large datasets on supercomputers. It combines sparse Gaussian processes, random embedding, and asynchronous parallelization to improve efficiency and scalability.
Contribution
The paper presents a novel scalable Bayesian optimization framework that integrates sparse GPs, random embedding, and asynchronous parallelization to handle big data, high dimensions, and HPC resources simultaneously.
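To make the random-embedding idea concrete, here is a minimal sketch assuming the common construction in which a low-dimensional point y is lifted to the full space as x = A y with a random Gaussian matrix A, then clipped to the search box. The summary does not say which embedding the paper uses, so the function names (`make_embedding`, `lift`), the box bounds, and the toy `objective` are all hypothetical.

```python
import random

def make_embedding(high_dim, low_dim, seed=0):
    """Draw a random Gaussian matrix A with shape (high_dim, low_dim)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(low_dim)]
            for _ in range(high_dim)]

def lift(A, y, lower=-1.0, upper=1.0):
    """Map a low-dimensional point y to x = A y, clipped to [lower, upper]^D."""
    x = []
    for row in A:
        value = sum(a * yi for a, yi in zip(row, y))
        x.append(min(upper, max(lower, value)))  # keep x inside the box
    return x

def objective(x):
    """Hypothetical objective: only the first two coordinates matter."""
    return (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2

# Assumed sizes: 10,000 ambient dimensions, 4 embedding dimensions.
A = make_embedding(high_dim=10_000, low_dim=4)
y = [0.1, -0.2, 0.05, 0.0]
x = lift(A, y)
print(len(x), objective(x))
```

In this sketch the optimizer would search only over the 4-dimensional y, while the objective is always evaluated at the lifted 10,000-dimensional x. That is only worthwhile when the objective has low effective dimensionality.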
Findings
Optimized high-dimensional problems with 10,000 dimensions.
Handled 1 million data points efficiently.
Ran 20 concurrent workers in an HPC environment.
Abstract
Bayesian optimization (BO) is a flexible and powerful framework that is suitable for computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. While it remains one of the most popular optimization methods, its capability is hindered by the size of the data, the dimensionality of the problem, and the sequential nature of the optimization. These scalability issues are intertwined and must be tackled simultaneously. In this work, we propose the Scalable3-BO framework, which employs a sparse GP as the underlying surrogate model to cope with Big Data and is equipped with a random embedding to efficiently optimize high-dimensional problems with low effective dimensionality. The Scalable3-BO framework is further equipped with an asynchronous parallelization feature, which fully exploits the computational…
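The asynchronous parallelization mentioned in the abstract can be illustrated with a small sketch: as soon as any worker finishes, its result is recorded and a new point is dispatched, so no worker waits for a whole batch. This is an illustration under stated assumptions, not the paper's implementation. `expensive_simulation` is a hypothetical stand-in for a costly simulation, and `propose` uses plain random sampling in place of the surrogate model and acquisition function.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation with variable run time."""
    time.sleep(0.001 * random.randint(1, 5))
    return sum((xi - 0.5) ** 2 for xi in x)

def propose(rng, dim):
    """Placeholder for acquisition maximization; here, plain random sampling."""
    return [rng.random() for _ in range(dim)]

def async_optimize(n_workers=4, budget=20, dim=3, seed=0):
    rng = random.Random(seed)
    history = []   # completed (x, f(x)) pairs
    pending = {}   # future -> point currently being evaluated
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Fill every worker once.
        for _ in range(min(n_workers, budget)):
            x = propose(rng, dim)
            pending[pool.submit(expensive_simulation, x)] = x
        submitted = len(pending)
        while pending:
            # Wake up as soon as any single evaluation finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                history.append((pending.pop(fut), fut.result()))
                # A real BO loop would refit the surrogate here, taking the
                # still-pending points into account before proposing.
                if submitted < budget:
                    x_new = propose(rng, dim)
                    pending[pool.submit(expensive_simulation, x_new)] = x_new
                    submitted += 1
    return history

history = async_optimize()
best_x, best_f = min(history, key=lambda pair: pair[1])
print(len(history), best_f)
```

Threads are used only to keep the sketch self-contained. On a supercomputer the workers would more plausibly be separate processes or scheduler jobs, which the summary does not describe.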
