MolLIBRA: Genetic Molecular Optimization with Multi-Fingerprint Surrogates and Text-Molecule Aligned Critic
Masahi Okada, Kazuki Sakai, Hiroaki Yoshida, Masaki Okoshi, Tadahiro Taniguchi

TL;DR
MolLIBRA is a genetic-algorithm-based molecular optimization framework that pre-ranks candidate molecules with multi-fingerprint Gaussian process surrogates and a text-molecule aligned critic, so that a limited oracle evaluation budget is spent on the most promising molecules.
Contribution
It introduces a hybrid approach that combines an ensemble of Gaussian process (GP) surrogates over multiple molecular fingerprints with a pretrained text-molecule aligned encoder (CLAMP) to improve sample efficiency in molecular optimization.
Findings
On the PMO benchmark with a budget of 1,000 oracle evaluations (PMO-1K), MolLIBRA-L attains the best Top-10 AUC on 14 of 22 tasks.
The method pre-ranks candidates before oracle calls by combining fingerprint-based GP surrogates with zero-shot scores from a text-molecule aligned encoder.
MolLIBRA-L is the variant that uses a language-model-based candidate generator.
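Because the findings are reported in Top-10 AUC, a small sketch of how such a metric can be computed may help. This is a simplified illustration, not the PMO benchmark's reference implementation: it assumes oracle scores lie in [0, 1], averages the running top-k mean over every oracle call, and holds the final value if a run stops before the budget is used. The function name `top_k_auc` and its parameters are hypothetical.

```python
from typing import List


def top_k_auc(scores: List[float], k: int = 10, budget: int = 1000) -> float:
    """Simplified Top-k AUC: mean over the budget of the running top-k average.

    `scores` are oracle scores in call order, assumed to lie in [0, 1].
    Assumption of this sketch: if the run stops early, the final top-k
    average is held for the remaining calls.
    """
    if not scores:
        return 0.0
    top: List[float] = []  # the k best scores seen so far, sorted descending
    area = 0.0
    used = scores[:budget]
    for s in used:
        top.append(s)
        top.sort(reverse=True)
        del top[k:]  # keep only the k best
        area += sum(top) / len(top)  # running top-k average after this call
    # Pad an early-stopped run with its final top-k average.
    area += (budget - len(used)) * (sum(top) / len(top))
    return area / budget
```

A run that finds good molecules early scores higher than one that reaches the same molecules late, which is why the metric rewards sample efficiency rather than only the final result.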
Abstract
We study sample-efficient molecular optimization under a limited budget of oracle evaluations. We propose MolLIBRA (MultimOdaLity and Language Integrated Bayesian and evolutionaRy optimizAtion), a genetic-algorithm-based framework that pre-ranks candidate molecules using multiple critics before oracle calls: (i) an ensemble of Gaussian process (GP) surrogates defined over multiple molecular fingerprints and (ii) a pretrained text-molecule aligned encoder, CLAMP. The GP ensemble enables adaptive selection of task-appropriate fingerprints, while CLAMP provides a zero-shot scoring signal from task descriptions by measuring the similarity between molecular and text embeddings. On the Practical Molecular Optimization (PMO) benchmark with a budget of 1,000 evaluations (PMO-1K), MolLIBRA-L, our variant with a language-model-based candidate generator, attains the best Top-10 AUC on 14/22 tasks…
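The zero-shot critic described in the abstract scores a molecule by the similarity between its embedding and the embedding of the task description. The sketch below illustrates that idea with cosine similarity and uses it to pre-rank candidates before any oracle call. It does not call CLAMP: the embeddings are toy vectors, and `cosine`, `pre_rank`, and `n_keep` are hypothetical names introduced for illustration. How MolLIBRA combines this signal with the GP ensemble is not specified in this excerpt and is not modeled here.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def pre_rank(
    text_emb: Sequence[float],
    mol_embs: Dict[str, Sequence[float]],
    n_keep: int,
) -> List[str]:
    """Return the n_keep candidates whose embeddings best match the task text."""
    ranked = sorted(
        mol_embs,
        key=lambda smiles: cosine(text_emb, mol_embs[smiles]),
        reverse=True,
    )
    return ranked[:n_keep]


# Toy 2-D embeddings standing in for encoder outputs (assumed values).
task_embedding = [1.0, 0.0]
candidates = {
    "CCO": [0.9, 0.1],
    "c1ccccc1": [0.1, 0.9],
    "CC(=O)O": [0.7, 0.7],
}
shortlist = pre_rank(task_embedding, candidates, n_keep=2)
print(shortlist)  # ['CCO', 'CC(=O)O']
```

Only the shortlisted molecules would then be sent to the oracle, which is how pre-ranking saves evaluations under a fixed budget.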
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Machine Learning and Data Classification
