From In Silico to In Vitro: Evaluating Molecule Generative Models for Hit Generation
Nagham Osman, Vittorio Lembo, Giovanni Bottegoni, Laura Toni

TL;DR
This study evaluates whether deep learning generative models can produce hit-like molecules for drug discovery. The results are promising: some generated compounds were synthesized and confirmed active in vitro, although limitations in current evaluation metrics and training data remain.
Contribution
To the authors' knowledge, the first study to explicitly frame hit-like molecule generation as a standalone task and to empirically test whether generative models can support this stage of drug discovery.
Findings
Models can generate valid, diverse, and biologically relevant compounds.
Some generated compounds, including GSK-3β hits, were synthesized and confirmed active in vitro.
Limitations of current evaluation metrics and of the available training data were identified.
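This summary does not say how properties such as diversity are measured. As an illustration only, the sketch below computes three metrics commonly reported for molecule generators: uniqueness, novelty, and internal diversity. The function names and toy data are hypothetical, not taken from the paper. Fingerprints are represented as plain sets of on-bit indices, and SMILES strings are assumed to be already canonicalized. In practice both would come from a cheminformatics toolkit such as RDKit.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def uniqueness(smiles):
    """Fraction of generated SMILES that are distinct (assumes canonical SMILES)."""
    return len(set(smiles)) / len(smiles) if smiles else 0.0


def novelty(generated, training):
    """Fraction of distinct generated SMILES that do not appear in the training set."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - set(training)) / len(unique)


def internal_diversity(fingerprints):
    """One minus the mean pairwise Tanimoto similarity over all fingerprint pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity


# Toy data, purely illustrative.
generated = ["CCO", "CCO", "c1ccccc1", "CCN"]
training = ["CCO"]
fingerprints = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]

print(uniqueness(generated))             # 3 distinct out of 4 generated
print(novelty(generated, training))      # 2 of the 3 distinct molecules are unseen
print(internal_diversity(fingerprints))  # closer to 1 means a more diverse set
```

Scores like these describe a generated set only in silico. They say nothing about biological activity, which is why the in vitro confirmation reported above matters.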
Abstract
Hit identification is a critical yet resource-intensive step in the drug discovery pipeline, traditionally relying on high-throughput screening of large compound libraries. Despite advancements in virtual screening, these methods remain time-consuming and costly. Recent progress in deep learning has enabled the development of generative models capable of learning complex molecular representations and generating novel compounds de novo. However, using ML to replace the entire drug-discovery pipeline is highly challenging. In this work, we instead investigate whether generative models can replace one step of the pipeline: hit-like molecule generation. To the best of our knowledge, this is the first study to explicitly frame hit-like molecule generation as a standalone task and empirically test whether generative models can directly support this stage of the drug discovery pipeline.…
Taxonomy
Topics: Computational Drug Discovery Methods · Cell Image Analysis Techniques · Machine Learning in Materials Science
