Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models
Namrata Anand, Tudor Achim

TL;DR
This paper introduces an equivariant denoising diffusion probabilistic model that generates full-atom protein backbones together with sequence and side-chain predictions. The model is learned solely from experimental data and operates at larger scales than previous molecular generative models.
Contribution
The work presents a scalable generative model of protein structure and sequence, learned entirely from experimental data, that conditions its generation on a compact specification of protein topology.
Findings
Produces high-quality protein structure and sequence samples
Operates at larger scales than prior models
Demonstrates effectiveness through qualitative and quantitative analyses
Abstract
Proteins are macromolecules that mediate a significant fraction of the cellular processes that underlie life. An important task in bioengineering is designing proteins with specific 3D structures and chemical properties which enable targeted functions. To this end, we introduce a generative model of both protein structure and sequence that can operate at significantly larger scales than previous molecular generative modeling approaches. The model is learned entirely from experimental data and conditions its generation on a compact specification of protein topology to produce a full-atom backbone configuration as well as sequence and side-chain predictions. We demonstrate the quality of the model via qualitative and quantitative analysis of its samples. Videos of sampling trajectories are available at https://nanand2.github.io/proteins.
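For readers unfamiliar with denoising diffusion probabilistic models, the following is a minimal sketch of the generic forward (noising) process applied to 3D coordinates. It is not the authors' implementation: the linear noise schedule, its endpoint values, the function names, and the toy four-point "backbone" are all assumptions chosen for illustration. The sketch shows only how clean coordinates x_0 are corrupted into x_t; the paper's learned, equivariant denoiser that reverses this process is not reproduced here.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values, an assumption)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running_product = [], 1.0
    for beta in betas:
        running_product *= 1.0 - beta
        alpha_bars.append(running_product)
    return alpha_bars


def noise_coordinates(coords, t, alpha_bars, rng):
    """Sample x_t from N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    coords is a list of (x, y, z) tuples; t is a 0-based step index.
    """
    alpha_bar = alpha_bars[t]
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(scale * value + sigma * rng.gauss(0.0, 1.0) for value in atom)
        for atom in coords
    ]


# Hypothetical toy "backbone": four points spaced 3.8 units apart on a line.
toy_backbone = [(3.8 * i, 0.0, 0.0) for i in range(4)]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

slightly_noised = noise_coordinates(toy_backbone, 0, alpha_bars, rng)
heavily_noised = noise_coordinates(toy_backbone, 999, alpha_bars, rng)
```

At the first step the coordinates are almost unchanged, while at the last step they are close to pure Gaussian noise. Because the added noise is isotropic, its distribution is unchanged by rotations of the coordinate frame, which is one reason this kind of noising is compatible with equivariant denoising networks.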
Taxonomy
Topics: Protein Structure and Dynamics · Biomedical Text Mining and Ontologies
