ProteinOPD: Towards Effective and Efficient Preference Alignment for Protein Design
Yulin Zhang, He Cao, Zihao Jiang, Chenyi Zi, Zhipeng Zhou, Zijing Liu, Yu Li, Jia Li, Ziqi Gao

TL;DR
ProteinOPD is a multi-objective preference alignment framework for protein design. It balances multiple objectives while preserving the inherent designability of pretrained protein language models, and it trains 8 times faster than RL-based methods.
Contribution
Adapts On-Policy Distillation (OPD) to multi-objective preference alignment of protein language models, balancing competing objectives while maintaining designability.
Findings
Achieves substantial gains on target preference objectives.
Maintains the inherent designability of protein language models.
Trains 8 times faster than RL-based methods.
Abstract
Designing proteins with desired functions or properties is a core goal in synthetic biology and drug discovery. Recent advances in protein language models (PLMs) have enabled the generation of highly designable protein sequences, while preference alignment provides a promising way to steer designs toward desired functions and properties. Nevertheless, existing alignment methods often trigger catastrophic forgetting of pretrained knowledge, degrading basic designability and failing to balance multiple competing objectives. To address these issues, we draw inspiration from On-Policy Distillation (OPD), an advanced post-training method renowned for mitigating catastrophic forgetting through its mode-seeking nature. In this work, we propose ProteinOPD, a multi-objective preference alignment framework that can effectively balance multiple preference objectives while maintaining the inherent designability…
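The "mode-seeking nature" mentioned in the abstract can be illustrated with a toy example. The sketch below assumes that OPD minimizes a reverse KL divergence, KL(student || teacher), evaluated on the student's own samples, which is a common formulation of on-policy distillation; the truncated abstract does not state the exact loss used in ProteinOPD. The distributions and variable names are hypothetical and chosen only for illustration.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical toy distributions over three candidate outputs (not from the paper).
teacher = [0.49, 0.02, 0.49]          # two good modes with a poor option between them
student_peaked = [0.96, 0.02, 0.02]   # commits to one teacher mode
student_spread = [1 / 3, 1 / 3, 1 / 3]  # spreads mass over all options

# Reverse KL (assumed OPD-style objective): expectation under the student.
rev_peaked = kl(student_peaked, teacher)
rev_spread = kl(student_spread, teacher)

# Forward KL (typical of supervised distillation): expectation under the teacher.
fwd_peaked = kl(teacher, student_peaked)
fwd_spread = kl(teacher, student_spread)

print("reverse KL prefers the peaked student:", rev_peaked < rev_spread)
print("forward KL prefers the spread student:", fwd_spread < fwd_peaked)
```

Under reverse KL, the student is penalized for placing mass where the teacher assigns little probability, so it tends to commit to a region the teacher already considers good rather than averaging across modes. This is the usual intuition for why such objectives are said to stay close to pretrained behavior.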
