ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization
Xuefeng Liu, Songhao Jiang, Ian Foster, Jinbo Xu, Rick Stevens

TL;DR
ScaffoldGPT is a novel transformer-based model designed for drug optimization that effectively balances preserving original scaffolds with enhancing desired drug properties, outperforming existing methods on COVID and cancer benchmarks.
Contribution
The paper introduces ScaffoldGPT, a three-stage optimization approach with a new pre-training strategy and token-level decoding, advancing scaffold-based drug design.
Findings
Outperforms baselines on the COVID and cancer drug optimization benchmarks
Preserves original functional scaffolds effectively
Enhances desired drug properties in experiments
Abstract
Drug optimization has become increasingly crucial in light of fast-mutating virus strains and drug-resistant cancer cells. Nevertheless, it remains challenging because it requires retaining the beneficial properties of the original drug while simultaneously enhancing desired attributes beyond its scope. In this work, we aim to tackle this challenge by introducing ScaffoldGPT, a novel Generative Pretrained Transformer (GPT) designed for drug optimization based on molecular scaffolds. Our work comprises three key components: (1) A three-stage drug optimization approach that integrates pretraining, finetuning, and decoding optimization. (2) A novel two-phase incremental pre-training strategy for scaffold-based drug optimization. (3) A token-level decoding optimization strategy, Top-N, that enables controlled, reward-guided generation using the pretrained or finetuned GPT. We demonstrate…
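The abstract names a token-level, reward-guided decoding strategy (Top-N) but this summary does not describe its exact procedure. The sketch below is therefore only one plausible reading, not the authors' algorithm: at each step it keeps the N most probable next tokens and picks the one whose extended prefix scores highest under a reward function. The names `top_n_reward_guided_decode`, `toy_model`, and `toy_reward` are hypothetical, and the toy model stands in for a pretrained or finetuned GPT.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces: a real system would query a GPT for next-token
# probabilities and score partial molecules with a property predictor.
NextTokenFn = Callable[[List[str]], Dict[str, float]]
RewardFn = Callable[[List[str]], float]


def top_n_reward_guided_decode(
    next_token_probs: NextTokenFn,
    reward: RewardFn,
    n: int = 2,
    max_len: int = 8,
    eos: str = "<eos>",
) -> List[str]:
    """Decode token by token, choosing by reward among the N most likely tokens.

    Ties in reward are broken by model probability. With n=1 this reduces
    to ordinary greedy decoding.
    """
    prefix: List[str] = []
    for _ in range(max_len):
        probs = next_token_probs(prefix)
        # Keep only the N most probable candidates at this step.
        candidates = sorted(probs, key=probs.get, reverse=True)[:n]
        # Score each candidate by the reward of the extended prefix.
        best = max(candidates, key=lambda t: (reward(prefix + [t]), probs[t]))
        if best == eos:
            break
        prefix.append(best)
    return prefix


def toy_model(prefix: List[str]) -> Dict[str, float]:
    # Assumption for illustration: a fixed distribution that strongly
    # prefers to stop after three tokens.
    if len(prefix) >= 3:
        return {"<eos>": 0.9, "C": 0.05, "N": 0.05}
    return {"C": 0.6, "N": 0.3, "O": 0.1}


def toy_reward(tokens: List[str]) -> float:
    # Assumption for illustration: reward counts nitrogen tokens.
    return float(tokens.count("N"))


if __name__ == "__main__":
    print(top_n_reward_guided_decode(toy_model, toy_reward, n=1))
    print(top_n_reward_guided_decode(toy_model, toy_reward, n=2))
```

In this toy setting, `n=1` follows the model's most likely token at every step, while `n=2` lets the reward steer generation toward the second most likely token without leaving the model's high-probability region. Larger N gives the reward more influence at the cost of drifting further from the model's own preferences.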
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Biomedical Text Mining and Ontologies
Methods: Linear Layer · Cosine Annealing · Weight Decay · Residual Connection · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Attention Dropout · Absolute Position Encodings · Label Smoothing
