TL;DR
ChemLML introduces a modular adapter-based approach to generate molecules from text by leveraging pretrained models, enabling efficient and flexible multi-modal molecule generation without training from scratch.
Contribution
It presents ChemLML, a lightweight adapter strategy that blends pretrained text and molecular models for conditional molecule generation, highlighting the impact of molecular representations.
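To make the adapter idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the adapter is a single cross-attention layer in which molecule-side hidden states attend to frozen text embeddings; this summary does not specify the adapter's internals. The class name `CrossAttentionAdapter`, the dimensions, and the random stand-in embeddings are all hypothetical. Only the adapter weights would be trained; both pretrained models stay frozen.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, used for weights and for stand-in embeddings."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class CrossAttentionAdapter:
    """Hypothetical adapter: molecule tokens attend to frozen text embeddings."""

    def __init__(self, mol_dim, text_dim, seed=0):
        rng = random.Random(seed)
        self.mol_dim = mol_dim
        # The only trainable parameters in this sketch.
        self.w_q = random_matrix(mol_dim, mol_dim, rng)   # queries from molecule states
        self.w_k = random_matrix(text_dim, mol_dim, rng)  # keys from text states
        self.w_v = random_matrix(text_dim, mol_dim, rng)  # values from text states

    def __call__(self, mol_states, text_states):
        q = matmul(mol_states, self.w_q)    # (n_mol, mol_dim)
        k = matmul(text_states, self.w_k)   # (n_text, mol_dim)
        v = matmul(text_states, self.w_v)   # (n_text, mol_dim)
        k_t = [list(col) for col in zip(*k)]
        scale = math.sqrt(self.mol_dim)
        weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
        context = matmul(weights, v)        # text information per molecule token
        # Residual connection: the output stays in the molecule embedding space.
        return [[h + c for h, c in zip(h_row, c_row)]
                for h_row, c_row in zip(mol_states, context)]


rng = random.Random(1)
text_states = random_matrix(5, 8, rng, scale=1.0)  # stand-in for frozen text encoder output
mol_states = random_matrix(3, 4, rng, scale=1.0)   # stand-in for molecule model hidden states
adapter = CrossAttentionAdapter(mol_dim=4, text_dim=8)
conditioned = adapter(mol_states, text_states)
print(len(conditioned), len(conditioned[0]))  # 3 4
```

The residual connection is the design choice worth noting: the conditioned states keep the molecule model's dimensionality, so the pretrained molecule decoder can consume them unchanged.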
Findings
The SMILES representation often yields better generation performance than SELFIES.
ChemLML can generate candidate molecules for drug discovery tasks.
A filtered dataset improves evaluation reliability.
Abstract
The development of large language models and multi-modal models has enabled the appealing idea of generating novel molecules from text descriptions. Generative modeling would shift the paradigm from relying on large-scale chemical screening to find molecules with desired properties to directly generating those molecules. However, multi-modal models combining text and molecules are often trained from scratch, without leveraging existing high-quality pretrained models. Training from scratch consumes more computational resources and prohibits model scaling. In contrast, we propose a lightweight adapter-based strategy named Chemical Language Model Linker (ChemLML). ChemLML blends the two single domain models and obtains conditional molecular generation from text descriptions while still operating in the specialized embedding spaces of the molecular domain. ChemLML can tailor diverse…
Taxonomy
Methods: Adapter
