NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
Ewa M. Nowara, Pedro O. Pinheiro, Sai Pooja Mahajan, Omar Mahmood, Andrew Martin Watkins, Saeed Saremi, Michael Maser

TL;DR
NEBULA is a latent 3D generative model that efficiently produces large, high-quality molecular libraries around a seed compound, nearly an order of magnitude faster than previous methods, and generalizes well to unseen drug-like molecules.
Contribution
NEBULA introduces a scalable, fast method for generating high-quality 3D molecular libraries by performing neural empirical Bayes sampling in a learned latent space. It is nearly an order of magnitude faster than existing voxel-based approaches without sacrificing sample quality.
Findings
NEBULA generates molecular libraries nearly ten times faster than prior methods.
It maintains high sample quality comparable to voxel-based models.
The model generalizes effectively to unseen drug-like molecules across multiple datasets.
Abstract
We present NEBULA, the first latent 3D generative model for scalable generation of large molecular libraries around a seed compound of interest. Such libraries are crucial for scientific discovery, but it remains challenging to generate large numbers of high quality samples efficiently. 3D-voxel-based methods have recently shown great promise for generating high quality samples de novo from random noise (Pinheiro et al., 2023). However, sampling in 3D-voxel space is computationally expensive and use in library generation is prohibitively slow. Here, we instead perform neural empirical Bayes sampling (Saremi & Hyvarinen, 2019) in the learned latent space of a vector-quantized variational autoencoder. NEBULA generates large molecular libraries nearly an order of magnitude faster than existing methods without sacrificing sample quality. Moreover, NEBULA generalizes better to unseen…
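The sampling idea in the abstract can be illustrated with a minimal sketch of neural empirical Bayes "walk-jump" sampling around a seed latent: add Gaussian noise to the seed, run a few Langevin steps on the smoothed density (walk), then denoise with the empirical Bayes estimate y + sigma^2 * score(y) (jump). Everything concrete here is an assumption made for illustration: the function names, the 4-dimensional latent, the noise level, the step size, and especially the score function, which is the analytic score of a toy Gaussian standing in for a learned denoiser. The sketch uses plain overdamped Langevin dynamics for brevity, and it omits the VQ-VAE encoder and decoder that would map molecules to and from the latent space.

```python
import math
import random


def gaussian_score(y, mu, var):
    """Score (gradient of the log-density) of an isotropic Gaussian N(mu, var * I) at y."""
    return [-(yi - mi) / var for yi, mi in zip(y, mu)]


def walk_jump_sample(seed_latent, score_fn, sigma, n_steps, step_size, rng):
    """Draw one latent near seed_latent (hypothetical helper, not the authors' code)."""
    # Noise the seed: y = z + sigma * eps, with eps ~ N(0, I).
    y = [z + sigma * rng.gauss(0.0, 1.0) for z in seed_latent]

    # Walk: unadjusted (overdamped) Langevin steps on the smoothed density p(y).
    noise_scale = math.sqrt(2.0 * step_size)
    for _ in range(n_steps):
        grad = score_fn(y)
        y = [
            yi + step_size * gi + noise_scale * rng.gauss(0.0, 1.0)
            for yi, gi in zip(y, grad)
        ]

    # Jump: empirical Bayes denoising, x_hat = y + sigma^2 * score(y).
    grad = score_fn(y)
    return [yi + sigma ** 2 * gi for yi, gi in zip(y, grad)]


# Toy setup (all values are assumptions): latents follow N(PRIOR_MEAN, PRIOR_VAR * I),
# so the noisy latents follow N(PRIOR_MEAN, (PRIOR_VAR + SIGMA^2) * I).
SIGMA = 0.5
PRIOR_MEAN = [0.0, 0.0, 0.0, 0.0]
PRIOR_VAR = 1.0


def toy_score(y):
    """Analytic score of the smoothed toy density; stands in for a learned score."""
    return gaussian_score(y, PRIOR_MEAN, PRIOR_VAR + SIGMA ** 2)


# Build a small "library" of latents around one seed latent.
seed_latent = [0.8, -0.3, 1.1, 0.2]
rng = random.Random(0)
library = [
    walk_jump_sample(seed_latent, toy_score, SIGMA, n_steps=5, step_size=0.05, rng=rng)
    for _ in range(8)
]
```

In this sketch, fewer walk steps keep samples closer to the seed, while more steps let the chain drift further into the smoothed density. In a full pipeline each sampled latent would then be passed through a decoder to obtain a molecule.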
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods
Methods: Lib
