CVT Archives and Chemical Embedding Measures for Multi-Objective Quality Diversity in Molecular Design
Dominic Mashak, Jacob Schrum

TL;DR
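This paper replaces the uniform grid archive of Multi-Objective MAP-Elites (MOME) with a Centroidal Voronoi Tessellation (CVT) archive whose cells are defined over learned chemical embeddings, and reports better diversity and quality than grid-based archives on a multi-objective molecular design problem.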
Contribution
It applies MOME with CVT archives defined by ChemBERTa-2 embeddings reduced via UMAP, so that archive cells follow chemical similarity instead of spending capacity on chemically infeasible regions of a uniform grid.
Findings
Embedding-based CVT archives outperform grid-based archives in hypervolume.
The method achieves higher multi-objective quality diversity scores.
Nearly all native archive niches are filled with the proposed approach.
Abstract
Nonlinear optical (NLO) materials are essential for photonic technologies, yet discovering optimal NLO molecules requires balancing multiple competing objectives across vast chemical spaces. Previous work showed that Multi-Objective MAP-Elites (MOME) with grid-based archives discovers diverse, high-quality molecules for electro-optic applications. However, uniform grid partitioning wastes archive capacity on chemically infeasible regions while undersampling high-density areas. We apply MOME with Centroidal Voronoi Tessellation (CVT) archives whose cells are defined by learned embeddings from ChemBERTa-2 Multi-Task Regression reduced via UMAP, capturing chemical similarity beyond simple structural features. We investigate a four-objective NLO molecular design problem: maximizing the hyperpolarizability ratio, constraining HOMO-LUMO gap and linear polarizability to target…
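The archive idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the molecules have already been mapped to low-dimensional embedding vectors (random 2-D points stand in for UMAP-reduced ChemBERTa-2 embeddings), approximates the CVT with a plain Lloyd's-algorithm loop, treats all objectives as maximized, and omits any cap on per-cell front size. The names `cvt_centroids`, `cell_index`, `dominates`, and `insert` are hypothetical helpers introduced here for illustration.

```python
import numpy as np

def cvt_centroids(samples, k, iters=20, seed=0):
    """Approximate CVT centroids with Lloyd's algorithm over sampled embeddings."""
    rng = np.random.default_rng(seed)
    centroids = samples[rng.choice(len(samples), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every sample to its nearest centroid.
        dists = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned samples.
        for j in range(k):
            members = samples[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids

def cell_index(centroids, embedding):
    """Index of the Voronoi cell (nearest centroid) containing an embedding."""
    return int(np.linalg.norm(centroids - embedding, axis=1).argmin())

def dominates(a, b):
    """Pareto dominance under maximization: a is no worse everywhere, better somewhere."""
    return bool(np.all(a >= b) and np.any(a > b))

def insert(archive, centroids, embedding, objectives, item):
    """Try to add an item to the Pareto front stored in its cell (MOME-style).

    Returns True if the item was added, False if it was dominated or duplicated.
    """
    objectives = np.asarray(objectives, dtype=float)
    front = archive.setdefault(cell_index(centroids, embedding), [])
    if any(dominates(o, objectives) or np.array_equal(o, objectives) for o, _ in front):
        return False
    # Drop existing members that the new item dominates, then add it.
    front[:] = [(o, m) for o, m in front if not dominates(objectives, o)]
    front.append((objectives, item))
    return True

# Stand-in data: random 2-D points in place of reduced chemical embeddings.
rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))
centroids = cvt_centroids(samples, k=8)
archive = {}
insert(archive, centroids, samples[0], [1.0, 0.2], "molecule_a")
insert(archive, centroids, samples[0], [0.3, 0.9], "molecule_b")
```

Because the centroids are fitted to sampled embeddings, cells concentrate where embeddings are dense, which is the contrast with a uniform grid that the abstract draws. Each cell keeps a set of mutually non-dominated items instead of a single elite, which is what distinguishes a multi-objective archive from a standard MAP-Elites one.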
