Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding
Bing Hu, Anita Layton, Helen Chen

TL;DR
Imagand is a novel diffusion model that generates pharmacokinetic data from SMILES strings, addressing data sparsity in drug discovery and enhancing downstream task performance.
Contribution
We introduce Imagand, a diffusion model that predicts pharmacokinetic properties from SMILES, improving data generation and utility in drug discovery.
Findings
Generated PK data closely matches real data distributions
Improves performance in downstream drug discovery tasks
Addresses data overlap sparsity in pharmacokinetic datasets
Abstract
Artificial intelligence (AI) is increasingly used in every stage of drug development. One challenge facing drug discovery AI is that drug pharmacokinetic (PK) datasets are typically collected independently of one another and with limited overlap, which creates data overlap sparsity. This sparsity makes data curation difficult for researchers looking to answer research questions in poly-pharmacy, drug combination research, and high-throughput screening. We propose Imagand, a novel SMILES-to-Pharmacokinetic (S2PK) diffusion model capable of generating an array of PK target properties conditioned on SMILES inputs. We show that Imagand-generated synthetic PK data closely resembles real data in both its univariate and bivariate distributions, and improves performance on downstream tasks. Imagand is a promising solution for data overlap sparsity and allows researchers to efficiently generate ligand PK data…
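To make the notion of data overlap sparsity concrete, the following minimal sketch counts how many molecules have a measurement in every one of several independently collected PK datasets. It is an illustration only, not part of Imagand: the dataset names, SMILES strings, numeric values, and the `overlap_report` helper are all hypothetical.

```python
# Hypothetical toy data: two PK datasets collected independently,
# each keyed by SMILES string. Values are illustrative, not real measurements.
clearance = {
    "CCO": 1.2,
    "CC(=O)O": 0.8,
    "c1ccccc1": 2.5,
    "CCN": 1.9,
}
half_life = {
    "CCO": 3.1,
    "CCCC": 5.4,
    "c1ccccc1O": 2.2,
}


def overlap_report(*datasets):
    """Return (n_union, n_complete, fraction_complete).

    n_union    : molecules that appear in at least one dataset
    n_complete : molecules that appear in every dataset
    """
    all_molecules = set().union(*(d.keys() for d in datasets))
    complete = set.intersection(*(set(d) for d in datasets))
    return len(all_molecules), len(complete), len(complete) / len(all_molecules)


n_union, n_complete, frac = overlap_report(clearance, half_life)
print(f"{n_complete} of {n_union} molecules have every property ({frac:.0%})")
```

In this toy case only one of six molecules carries both properties, so any analysis that needs both values per molecule is left with very little data. A generative model conditioned on SMILES is one way to fill in the missing entries.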
Taxonomy
Topics: Computational Drug Discovery Methods
Methods: Diffusion
