# Generative Multiobjective Bayesian Optimization with Scalable Batch Evaluations for Sample-Efficient De Novo Molecular Design

**Authors:** Madhav R. Muthyala, Farshud Sorourifar, Tianhong Tan, You Peng, Joel A. Paulson

PMC · DOI: 10.1021/acs.iecr.5c03166 · 2025-12-21

## TL;DR

This paper introduces a modular "generate-then-optimize" method for designing molecules that satisfy multiple, often conflicting objectives with few evaluations, pairing a generative model with multiobjective Bayesian optimization.

## Contribution

The paper presents a novel 'generate-then-optimize' framework for de novo multiobjective molecular design, together with qPMHI, a batch acquisition function that decomposes additively and therefore scales to large candidate pools.

## Key findings

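- The proposed framework outperforms state-of-the-art latent-space and discrete molecular optimization methods on synthetic benchmarks and application-driven tasks.
- In a sustainable energy storage case study, the approach quickly identifies novel, diverse, and high-performing quinone-based organic cathode materials for aqueous redox flow batteries.
- Because qPMHI decomposes additively, batch selection for Pareto front expansion reduces to ranking per-candidate probabilities estimated by Monte Carlo sampling.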

## Abstract

Designing molecules that must satisfy multiple, often conflicting, objectives is a central challenge in molecular discovery. The enormous size of the chemical space and the cost of high-fidelity simulations have driven the development of machine learning-guided strategies for accelerating design with limited data. Among these, Bayesian optimization (BO) offers a principled framework for sample-efficient search, while generative models provide a mechanism to propose novel, diverse candidates beyond fixed libraries. However, existing methods that couple the two often rely on continuous latent spaces, which introduce both architectural entanglement and scalability challenges. This work introduces an alternative, modular “generate-then-optimize” framework for de novo multiobjective molecular design/discovery. At each iteration, a generative model is used to construct a large, diverse pool of candidate molecules, after which a novel acquisition function, qPMHI (multipoint Probability of Maximum Hypervolume Improvement), is used to optimally select a batch of candidates most likely to induce the largest Pareto front expansion. The key insight is that qPMHI decomposes additively, enabling exact, scalable batch selection via only a simple ranking of probabilities that can be easily estimated with Monte Carlo sampling. We benchmark the framework against state-of-the-art latent-space and discrete molecular optimization methods, demonstrating significant improvements across synthetic benchmarks and application-driven tasks. Specifically, in a case study related to sustainable energy storage, we show that our approach quickly uncovers novel, diverse, and high-performing organic (quinone-based) cathode materials for aqueous redox flow battery applications.
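
The batch-selection step described in the abstract can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation. The assumption is that each candidate is scored by the probability that it attains the largest hypervolume improvement in the pool; if ties are ignored, those events are mutually exclusive, so the probability that a batch contains the maximizer is the sum of its members' scores, and the best batch is simply the top-q candidates. Everything named here is hypothetical: the functions `hypervolume_2d`, `estimate_pmhi`, and `select_batch`, the restriction to two maximized objectives, and the independent Gaussian draws that stand in for samples from a surrogate model's joint posterior.

```python
import numpy as np


def hypervolume_2d(points, ref):
    """Area dominated by `points` above reference point `ref` (two maximized objectives)."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    pts = pts[np.all(pts > ref, axis=1)]      # keep points that dominate the reference
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(-pts[:, 0])]         # sweep from largest first objective
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 > best_f2:                      # point adds a new strip of area
            area += (f1 - ref[0]) * (f2 - best_f2)
            best_f2 = f2
    return area


def estimate_pmhi(samples, front, ref):
    """Monte Carlo estimate of P(candidate i has the largest hypervolume improvement).

    samples: array of shape (S, N, 2), S posterior draws for N candidates.
    front:   array of shape (M, 2), current Pareto front.
    """
    n_samples, n_cand, _ = samples.shape
    base = hypervolume_2d(front, ref)
    wins = np.zeros(n_cand)
    for s in range(n_samples):
        hvi = np.array([
            hypervolume_2d(np.vstack([front, samples[s, i]]), ref) - base
            for i in range(n_cand)
        ])
        # Assumption of this sketch: a draw where nothing improves the front
        # is credited to no candidate.
        if hvi.max() > 0:
            wins[np.argmax(hvi)] += 1
    return wins / n_samples


def select_batch(probs, q):
    """Additive score, so the best batch of size q is the q largest probabilities."""
    return np.argsort(-probs)[:q]


# Toy usage with made-up numbers (not data from the paper).
rng = np.random.default_rng(0)
front = np.array([[1.0, 2.0], [2.0, 1.0]])
ref = np.array([0.0, 0.0])
mean = rng.uniform(0.5, 2.5, size=(6, 2))          # hypothetical posterior means
std = 0.3                                          # hypothetical posterior std
samples = mean + std * rng.standard_normal((200, 6, 2))
probs = estimate_pmhi(samples, front, ref)
batch = select_batch(probs, q=2)
print("estimated probabilities:", np.round(probs, 3))
print("selected batch indices:", batch)
```

In this sketch the cost of choosing a batch is one pass over the posterior draws plus a sort, with no combinatorial search over subsets, which is the scalability argument the abstract makes. A real surrogate would supply correlated joint samples over the pool rather than the independent draws used here.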

## Full-text entities

- **Chemicals:** quinone (MESH:C004532)

## Figures

50 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12810395/full.md

---
Source: https://tomesphere.com/paper/PMC12810395