Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders
Samuel Stanton, Wesley Maddox, Nate Gruver, Phillip Maffettone, Emily Delaney, Peyton Greenside, Andrew Gordon Wilson

TL;DR
This paper introduces LaMBO, a Bayesian optimization method that combines a denoising autoencoder with Gaussian processes to optimize biological sequences for drug and protein design, addressing the discrete, high-dimensional search space that has hindered BayesOpt in this setting.
Contribution
LaMBO is the first approach to jointly train autoencoders with Gaussian processes for gradient-based multi-objective optimization in biological sequence design.
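As a rough illustration of what gradient-based acquisition optimization in a continuous latent space looks like, the sketch below fits a plain single-output Gaussian process to a handful of latent points and climbs an upper confidence bound (UCB) by gradient ascent. This is not the authors' implementation: LaMBO learns its latent space with an autoencoder and uses a multi-task GP with multi-objective acquisition functions, whereas here the latent points are random, the GP is hand-rolled in NumPy, and single-objective UCB is swapped in as a simpler stand-in. All names (`rbf`, `ucb_and_grad`, `ascend`) and hyperparameters are hypothetical.

```python
import numpy as np

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between rows of a and rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def ucb_and_grad(z, Z, y, beta=2.0, length=1.0, jitter=1e-6):
    """Return UCB(z) = mu(z) + beta * sigma(z) and its gradient w.r.t. z."""
    K = rbf(Z, Z, length) + jitter * np.eye(len(Z))
    k = rbf(z[None, :], Z, length)[0]           # cross-covariances, shape (n,)
    alpha = np.linalg.solve(K, y)               # K^{-1} y
    v = np.linalg.solve(K, k)                   # K^{-1} k
    mu = k @ alpha                              # posterior mean
    sigma = np.sqrt(max(1.0 - k @ v, 1e-12))    # posterior std (prior variance 1)
    dk = k[:, None] * (Z - z[None, :]) / length**2  # d k_i / d z, shape (n, d)
    dmu = dk.T @ alpha
    dsigma = -(dk.T @ v) / sigma
    return mu + beta * sigma, dmu + beta * dsigma

def ascend(z0, Z, y, steps=50, lr=0.1):
    """Gradient ascent on UCB; a step is kept only if it improves the value."""
    z = z0.copy()
    val, grad = ucb_and_grad(z, Z, y)
    for _ in range(steps):
        cand = z + lr * grad
        cand_val, cand_grad = ucb_and_grad(cand, Z, y)
        if cand_val > val:
            z, val, grad = cand, cand_val, cand_grad
        else:
            lr *= 0.5                           # shrink the step and retry
    return z, val

# Hypothetical data: 8 random 2-D "latent codes" with a toy objective.
rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 2))
y = np.sin(Z[:, 0]) + np.cos(Z[:, 1])

z0 = np.zeros(2)
start_val, _ = ucb_and_grad(z0, Z, y)
z_best, best_val = ascend(z0, Z, y)
```

The `beta` term controls how strongly the search is drawn toward uncertain regions of the latent space. In a sequence-design setting the optimized latent point would still have to be decoded back into a discrete sequence, a step this sketch leaves out.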
Findings
LaMBO outperforms genetic algorithms in sequence optimization tasks.
It effectively balances exploration and exploitation in multi-objective settings.
It does not require large pretraining datasets, making it practical for biological applications.
Abstract
Bayesian optimization (BayesOpt) is a gold standard for query-efficient continuous optimization. However, its adoption for drug design has been hindered by the discrete, high-dimensional nature of the decision variables. We develop a new approach (LaMBO) which jointly trains a denoising autoencoder with a discriminative multi-task Gaussian process head, allowing gradient-based optimization of multi-objective acquisition functions in the latent space of the autoencoder. These acquisition functions allow LaMBO to balance the explore-exploit tradeoff over multiple design rounds, and to balance objective tradeoffs by optimizing sequences at many different points on the Pareto frontier. We evaluate LaMBO on two small-molecule design tasks, and introduce new tasks optimizing in silico and in vitro properties of large-molecule fluorescent proteins. In our experiments LaMBO…
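The Pareto frontier mentioned in the abstract is the set of candidates that no other candidate beats on every objective at once. A minimal sketch of how that set could be extracted from a table of objective scores is shown below, assuming every objective is to be maximized. The function name `pareto_mask` and the example scores are made up for illustration and do not come from the paper.

```python
import numpy as np

def pareto_mask(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows, assuming all objectives are maximized."""
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is >= on every objective and > on at least one.
        at_least_as_good = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        if np.any(at_least_as_good & strictly_better):
            mask[i] = False
    return mask

# Hypothetical two-objective scores for five candidate sequences.
scores = np.array([
    [1.0, 5.0],
    [2.0, 4.0],
    [1.5, 3.0],   # dominated by [2.0, 4.0]
    [3.0, 1.0],
    [0.5, 0.5],   # dominated by every other row
])
front = scores[pareto_mask(scores)]
```

Here `front` keeps the three candidates that each represent a different tradeoff between the two objectives. The pairwise loop is quadratic in the number of candidates, which is fine for small batches but would need a faster non-dominated sort for large ones.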
Taxonomy
Topics: Computational Drug Discovery Methods · Protein Structure and Dynamics · Machine Learning in Materials Science
Methods: Gaussian Process · Denoising Autoencoder
