General Multimodal Protein Design Enables DNA-Encoding of Chemistry
Jarrid Rector-Brooks, Théophile Lambert, Marta Skreta, Daniel Roth, Yueming Long, Zi-Qi Li, Xi Zhang, Miruna Cretu, Francesca-Zhoufan Li, Tanvi Ganapathy, Emily Jin, Avishek Joey Bose, Jason Yang, Kirill Neklyudov, Yoshua Bengio, Alexander Tong, Frances H. Arnold, Cheng-Hao Liu

TL;DR
DISCO is a multimodal deep generative model that co-designs protein sequences and structures to create novel enzymes capable of catalyzing new-to-nature reactions, with high activity and evolvability.
Contribution
Introduces DISCO, a scalable multimodal model for protein design that, conditioned only on reactive intermediates, creates enzymes with novel active sites and catalytic functions.
Findings
Designed heme enzymes catalyzing new-to-nature reactions with high activity.
DISCO's enzyme designs outperform engineered enzymes in activity.
Random mutagenesis further improved enzyme activity.
Abstract
Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp³)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis of a…
