A study of EHVI vs fixed scalarization for molecule design

Anabel Yong; Austin Tripp; Layla Hosseini-Gerami; Brooks Paige

arXiv:2507.13704 · cs.LG · December 25, 2025

Open Access

TL;DR

This paper compares Pareto-based Expected Hypervolume Improvement (EHVI) with fixed-weight scalarized Expected Improvement (EI) for molecular design, using identical Gaussian Process surrogates and molecular representations. Across three molecular optimization tasks, EHVI achieves better Pareto front coverage, faster convergence, and greater chemical diversity.

Contribution

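It provides a tightly controlled benchmark, with identical Gaussian Process surrogates and molecular representations for both methods, showing the empirical advantages of EHVI over fixed-weight scalarized EI in molecular optimization tasks.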

Findings

01. EHVI outperforms scalarized EI in Pareto front coverage

02. EHVI converges faster in molecular optimization tasks

03. EHVI maintains greater chemical diversity

Abstract

Multi-objective Bayesian optimization (MOBO) provides a principled framework for navigating trade-offs in molecular design. However, its empirical advantages over scalarized alternatives remain underexplored. We benchmark a simple Pareto-based MOBO strategy - Expected Hypervolume Improvement (EHVI) - against a simple fixed-weight scalarized baseline using Expected Improvement (EI), under a tightly controlled setup with identical Gaussian Process surrogates and molecular representations. Across three molecular optimization tasks, EHVI consistently outperforms scalarized EI in terms of Pareto front coverage, convergence speed, and chemical diversity. While scalarization encompasses flexible variants - including random or adaptive schemes - our results show that even strong deterministic instantiations can underperform in low-data regimes. These findings offer concrete evidence for the…
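
To make the contrast in the abstract concrete, here is a minimal sketch of the two quantities being compared, for two objectives that are both maximized. It is an illustration under stated assumptions, not the authors' implementation. The function names, the reference point, the equal weights, and the independent Gaussian predictive distribution used in the Monte Carlo estimate are all hypothetical choices made for this example. The paper's surrogates are Gaussian Processes, which are not reproduced here.

```python
import random


def pareto_front(points):
    """Non-dominated points (maximization), sorted by first objective, descending."""
    pts = [tuple(p) for p in points]
    front = {
        p for p in pts
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in pts)
    }
    return sorted(front, reverse=True)


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area, prev_y = 0.0, ref[1]
    for x, y in front:  # x decreases along the front, so y increases
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


def hv_improvement(candidate, points, ref):
    """Hypervolume gained by adding one observed outcome to the current set."""
    extended = list(points) + [tuple(candidate)]
    return hypervolume_2d(extended, ref) - hypervolume_2d(points, ref)


def scalarized_improvement(candidate, points, weights):
    """Improvement of a fixed-weight linear scalarization over the incumbent."""
    def score(y):
        return weights[0] * y[0] + weights[1] * y[1]
    return max(0.0, score(candidate) - max(score(p) for p in points))


def mc_ehvi(mean, std, points, ref, n_samples=2000, seed=0):
    """Monte Carlo estimate of expected hypervolume improvement.

    Assumes (for illustration only) independent Gaussian predictions
    per objective with the given means and standard deviations.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = (rng.gauss(mean[0], std[0]), rng.gauss(mean[1], std[1]))
        total += hv_improvement(sample, points, ref)
    return total / n_samples


observed = [(3, 1), (1, 3)]   # two extreme trade-offs already evaluated
ref = (0, 0)                  # hypothetical reference point
candidate = (2, 2)            # a balanced trade-off

print(hv_improvement(candidate, observed, ref))                # 1.0
print(scalarized_improvement(candidate, observed, (0.5, 0.5)))  # 0.0
```

In this toy case the balanced candidate enlarges the dominated region, so the hypervolume-based criterion rewards it. The equal-weight scalarization assigns it the same score as the two incumbents and registers no improvement. This is one intuition for why a fixed weight vector can cover less of the Pareto front. It does not reproduce the paper's experiments.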

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Machine Learning in Materials Science · Gaussian Processes and Bayesian Inference