Language model driven: a PROTAC generation pipeline with dual constraints of structure and property
Jinsong Shao, Qineng Gong, Zeyu Yin, Yu Chen, Yajie Hao, Lei Zhang, Linlin Jiang, Min Yao, Jinlong Li, Fubo Wang, Li Wang

TL;DR
This paper introduces LM-PROTAC, an AI-driven pipeline using a transformer-based model with dual constraints to design PROTAC molecules targeting specific proteins, demonstrated on Wnt3a.
Contribution
The study presents a novel language model-based pipeline with dual structure and property constraints for efficient PROTAC molecule generation.
Findings
Successfully generated PROTAC molecules targeting Wnt3a
Demonstrated the pipeline's ability to meet structural and property constraints
Validated generated molecules through in vitro experiments
Abstract
The imperfect modeling of ternary complexes has limited the application of computer-aided drug discovery tools in PROTAC research and development. In this study, an AI-assisted PROTAC molecule design pipeline named LM-PROTAC (language model driven Proteolysis Targeting Chimera) was developed by embedding a transformer-based generative model with dual constraints on structure and properties, referred to as the DCT. The pipeline uses a fragment-based representation of molecules and proceeds in stages. First, a language model driven protein-compound affinity model was used to screen molecular fragments with high affinity for the target protein. Second, the structural and physicochemical properties of these fragments were constrained during the generation process to meet specific scenario requirements. Finally, a two-round screening of…
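The first two stages described in the abstract (affinity screening of fragments, then a property constraint) can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the transformer-based generative model with plain filtering, and the `Fragment` class, the `select_fragments` function, the affinity scores, and the molecular weight cutoff are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A molecular fragment with illustrative, made-up attributes."""
    name: str          # hypothetical label, not a real compound
    affinity: float    # hypothetical predicted affinity score in [0, 1]
    mol_weight: float  # illustrative molecular weight in Da


def select_fragments(fragments, min_affinity, max_weight):
    """Keep fragments that pass both filters, best affinity first.

    Stage 1 (assumed): drop fragments whose predicted affinity for the
    target is below min_affinity.
    Stage 2 (assumed): drop fragments that violate a simple property
    constraint, here a molecular weight ceiling.
    """
    high_affinity = [f for f in fragments if f.affinity >= min_affinity]
    constrained = [f for f in high_affinity if f.mol_weight <= max_weight]
    return sorted(constrained, key=lambda f: f.affinity, reverse=True)


if __name__ == "__main__":
    candidates = [
        Fragment("frag_a", affinity=0.91, mol_weight=310.0),
        Fragment("frag_b", affinity=0.42, mol_weight=250.0),
        Fragment("frag_c", affinity=0.85, mol_weight=720.0),
        Fragment("frag_d", affinity=0.77, mol_weight=290.0),
    ]
    for frag in select_fragments(candidates, min_affinity=0.5, max_weight=500.0):
        print(frag.name, frag.affinity, frag.mol_weight)
```

In the pipeline as described, the constraints act during generation rather than as a filter applied afterwards, so this sketch only conveys the order of the two criteria, not how the DCT enforces them.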
Taxonomy
Methods: Fragmentation · Chimera
