AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB
Zhangyang Gao, Cheng Tan, Stan Z. Li

TL;DR
This paper introduces AlphaDesign, a new benchmark for protein design based on AlphaFoldDB, and proposes ADesign, a graph-based method that improves accuracy and efficiency in predicting protein sequences from structures.
Contribution
The paper establishes a large-scale standardized benchmark for protein design and introduces ADesign, a novel graph transformer-based method with enhanced features and efficiency.
Findings
ADesign outperforms previous models in accuracy by 8%
ADesign runs inference more than 40 times faster than previous models
The benchmark facilitates standardized comparisons in protein design
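The summary does not define how accuracy is measured. A minimal sketch, assuming it refers to per-residue sequence recovery (the fraction of positions where the predicted amino acid matches the native one). The function name and the example sequences are hypothetical and are not taken from the paper.

```python
def sequence_recovery(predicted: str, native: str) -> float:
    """Fraction of positions where the predicted residue equals the native residue.

    Assumption: both sequences are aligned one-to-one (same length), as is
    the case when a sequence is designed for a fixed backbone.
    """
    if len(predicted) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


# Hypothetical 10-residue example with two mismatches (positions 4 and 10).
native = "MKTAYIAKQR"
predicted = "MKTGYIAKQL"
recovery = sequence_recovery(predicted, native)
print(f"recovery = {recovery:.2f}")
```

Under this reading, a benchmark score would be the recovery averaged over all test proteins, which is what makes comparisons between methods standardized.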
Abstract
While DeepMind has tentatively solved protein folding, its inverse problem -- protein design, which predicts protein sequences from their 3D structures -- still faces significant challenges. In particular, the lack of a large-scale standardized benchmark and poor accuracy hinder research progress. To standardize comparisons and draw more research interest, we use AlphaFold DB, one of the world's largest protein structure databases, to establish a new graph-based benchmark -- AlphaDesign. Based on AlphaDesign, we propose a new method called ADesign to improve accuracy by introducing protein angles as new features, using a simplified graph transformer encoder (SGT), and proposing a confidence-aware protein decoder (CPD). Meanwhile, SGT and CPD also improve model efficiency by simplifying the training and testing procedures. Experiments show that ADesign significantly outperforms…
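The abstract mentions protein angles as input features but does not list them here. As an illustration only, the sketch below computes one common kind of angle, a signed dihedral defined by four consecutive atoms, and encodes it as a (sin, cos) pair. The choice of dihedral angles, the (sin, cos) encoding, and all function names are assumptions for this example, not the paper's stated feature set.

```python
import math

Vec = tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle in degrees, in (-180, 180], about the p1->p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


def angle_features(angle_deg: float) -> tuple[float, float]:
    """Encode an angle as (sin, cos) so that -180 and 180 map to the same point."""
    rad = math.radians(angle_deg)
    return (math.sin(rad), math.cos(rad))


# Hypothetical coordinates: the central bond runs along the z axis.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(angle, angle_features(angle))
```

The (sin, cos) pair avoids the discontinuity at the ±180 degree boundary, which is one reason angular quantities are often encoded this way before being passed to a neural network.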
Taxonomy
Topics: Protein Structure and Dynamics · Microbial Metabolic Engineering and Bioproduction · Machine Learning in Bioinformatics
Methods: Attention Is All You Need · Linear Layer · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Multi-Head Attention · Laplacian EigenMap · Absolute Position Encodings · Byte Pair Encoding · Label Smoothing · Dense Connections · Residual Connection
