Pan-protein Design Learning Enables Task-adaptive Generalization for Low-resource Enzyme Design
Jiangbin Zheng, Ge Wang, Han Zhang, Stan Z. Li

TL;DR
This paper introduces CrossDesign, a domain-adaptive framework that leverages pretrained protein language models for task-specific enzyme design in low-data scenarios. It outperforms existing methods on enzyme datasets and remains robust on out-of-domain enzymes.
Contribution
The work presents a novel computational protein design (CPD) paradigm and a domain-adaptive framework that transfers knowledge from pretrained protein language models to structure models for enzyme design.
Findings
CrossDesign outperforms existing methods on enzyme datasets.
The framework demonstrates robustness with out-of-domain enzymes.
It achieves accurate fitness prediction on mutation data.
Abstract
Computational protein design (CPD) offers transformative potential for bioengineering, but current deep CPD models, focused on universal domains, struggle with function-specific designs. This work introduces a novel CPD paradigm tailored for functional design tasks, particularly for enzymes, a key protein class often lacking specific application efficiency. To address structural data scarcity, we present CrossDesign, a domain-adaptive framework that leverages pretrained protein language models (PPLMs). By aligning protein structures with sequences, CrossDesign transfers pretrained knowledge to structure models, overcoming the limitations of limited structural data. The framework combines autoregressive (AR) and non-autoregressive (NAR) states in its encoder-decoder architecture, applying it to enzyme datasets and pan-proteins. Experimental results highlight CrossDesign's superior…
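To make the AR/NAR distinction mentioned in the abstract concrete, here is a minimal toy sketch of the two decoding styles in general. It is not CrossDesign's architecture: `toy_scores` is a hypothetical stand-in for a trained structure-conditioned decoder, and the function names and scoring formula are assumptions chosen only for illustration.

```python
# Toy contrast between non-autoregressive (NAR) and autoregressive (AR) decoding.
# Everything here is illustrative; no real model or structure input is involved.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_scores(position, prev_index):
    """Hypothetical scorer: one deterministic score per amino acid.

    prev_index is the index of the previously emitted residue,
    or -1 when no previous residue is used.
    """
    return [
        (31 * position + 17 * i + 7 * (prev_index + 1) * (i + 3)) % 97
        for i in range(len(AMINO_ACIDS))
    ]


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def decode_nar(length):
    """NAR: each position is predicted independently of the other outputs."""
    return "".join(
        AMINO_ACIDS[argmax(toy_scores(pos, -1))] for pos in range(length)
    )


def decode_ar(length):
    """AR: positions are predicted left to right, each conditioned on the
    residue emitted at the previous step."""
    prev = -1
    residues = []
    for pos in range(length):
        prev = argmax(toy_scores(pos, prev))
        residues.append(AMINO_ACIDS[prev])
    return "".join(residues)


if __name__ == "__main__":
    print("NAR:", decode_nar(8))
    print("AR: ", decode_ar(8))
```

In this sketch NAR decoding could run all positions in parallel, while AR decoding is sequential because each step consumes the previous output. How CrossDesign actually combines the two states is described in the paper itself, not here.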
Taxonomy
Topics: Software Engineering Research · Protein purification and stability · Machine Learning in Materials Science
