Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment
Xiaoyang Hou, Junqi Liu, Chence Shi, Xin Liu, Zhi Yang, Jian Tang

TL;DR
ProtAlign is a multi-objective framework that fine-tunes inverse folding models to optimize multiple developability properties of proteins while maintaining structural accuracy, addressing a key challenge in protein design.
Contribution
The paper introduces ProtAlign, a semi-online preference optimization method that balances multiple protein developability objectives while fine-tuning an inverse folding model.
Findings
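MoMPNN, the model fine-tuned with ProtAlign, improves developability metrics without losing designability.
The framework is effective across crystal-structure redesign, sequence design for de novo generated backbones, and de novo binder design.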
ProtAlign enhances practical protein sequence design applications.
Abstract
Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc mutation, inference-time biasing, or retraining on property-specific subsets, yet they are target dependent and demand substantial domain expertise or careful hyperparameter tuning. In this paper, we introduce ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to satisfy diverse developability objectives while preserving structural fidelity. ProtAlign employs a semi-online Direct Preference Optimization strategy with a flexible preference margin to mitigate conflicts among competing objectives and constructs preference pairs using in silico property…
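The abstract's "Direct Preference Optimization strategy with a flexible preference margin" can be illustrated with a minimal sketch of a margin-augmented DPO loss for a single preference pair. The standard DPO term is well established; where exactly ProtAlign places the margin is not stated in this excerpt, so subtracting it inside the sigmoid is an assumption, as are the function name and the default `beta`.

```python
import math

def dpo_margin_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1, margin=0.0):
    """Margin-augmented DPO loss for one (winner, loser) pair.

    logp_* are sequence log-likelihoods under the policy being fine-tuned,
    ref_logp_* under the frozen reference (pretrained) model.
    ASSUMPTION: the margin is subtracted inside the sigmoid; the source
    excerpt does not give the exact form used by ProtAlign.
    """
    # Implicit reward gap between winner and loser, as in standard DPO.
    delta = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    z = delta - margin
    # -log(sigmoid(z)) equals softplus(-z); computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

With `margin=0.0` this reduces to the usual DPO objective. A larger margin asks the policy to separate the pair more strongly before the loss flattens out, so a small or zero margin softens pairs whose preference is weak or contested.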
Peer Reviews
Decision: ICLR 2026 Poster
- Method is simple and general: multi-objective DPO with an adaptive preference margin to mitigate conflicts across properties; the training pipeline evenly samples pairwise entries across properties and alternates rollout and training for efficiency.
- Practical semi-online training decouples rollout/evaluation from optimization, enabling batch computation and easier deployment while retaining online exploration benefits.
- Evaluations are broad and application-relevant: crystal redesign, de
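The training pipeline described in the review above (alternating rollout and training, with pairs sampled evenly across properties) can be sketched roughly as follows. This is a toy outline under stated assumptions, not the authors' implementation: `rollout`, `score_fns`, and `train_step` are hypothetical callables standing in for sequence sampling, in silico property predictors, and a DPO update, and the high-versus-low pairing rule is an illustrative choice.

```python
import random

def build_pairs(sequences, score_fn):
    """Pair high- and low-scoring sequences as (winner, loser) for one property."""
    ranked = sorted(sequences, key=score_fn)
    half = len(ranked) // 2
    pairs = [(ranked[-1 - i], ranked[i]) for i in range(half)]
    # Drop ties: a pair with equal scores expresses no preference.
    return [(w, l) for w, l in pairs if score_fn(w) > score_fn(l)]

def sample_evenly(pairs_by_property, n_per_property, rng):
    """Draw the same number of pairs from every property, then shuffle."""
    n = min([n_per_property] + [len(p) for p in pairs_by_property.values()])
    batch = []
    for prop, pairs in pairs_by_property.items():
        batch.extend((prop, w, l) for w, l in rng.sample(pairs, n))
    rng.shuffle(batch)
    return batch

def semi_online_training(rollout, score_fns, train_step,
                         n_rounds, n_per_property, seed=0):
    """Alternate a batched rollout/scoring phase with a training phase."""
    rng = random.Random(seed)
    for _ in range(n_rounds):
        # Phase 1: sample from the current policy and score in batch.
        sequences = rollout()
        pairs_by_property = {
            name: build_pairs(sequences, fn) for name, fn in score_fns.items()
        }
        # Phase 2: update on a property-balanced batch before the next rollout.
        train_step(sample_evenly(pairs_by_property, n_per_property, rng))
```

Because scoring happens once per round rather than per gradient step, the property predictors can run as a batch job, which is the decoupling the reviewer highlights.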
- Limited ablations on multi-objective weights and margin settings. It might be helpful to quantify how weights, temperature, and margin thresholds shape the Pareto front and to provide transferable default configurations, as the paper relies heavily on them.
- The adaptive preference margin m(y_w, y_l) is precomputed from auxiliary property deltas and then kept fixed during training. This is simple and fast, but it cannot react if the policy distribution drifts, predictors recalibrate, or property tr
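To make the reviewer's point about a precomputed margin concrete, here is one hypothetical way a margin m(y_w, y_l) could be derived from auxiliary property deltas. The excerpt does not give the actual formula, so the clipped-mean rule, the `scale` and `max_margin` parameters, and the function name are all assumptions chosen for illustration.

```python
def precompute_margin(aux_w, aux_l, scale=1.0, max_margin=1.0):
    """Hypothetical margin from auxiliary property deltas (winner minus loser).

    aux_w / aux_l map auxiliary property names to predicted scores for the
    winner and loser. ASSUMPTION: the margin is the clipped mean delta; the
    real ProtAlign rule is not specified in the source excerpt.
    """
    shared = sorted(aux_w.keys() & aux_l.keys())
    if not shared:
        return 0.0
    mean_delta = sum(aux_w[k] - aux_l[k] for k in shared) / len(shared)
    # The margin shrinks to zero when the auxiliary properties disagree with
    # the preference, and is capped when they agree strongly.
    return min(max(scale * mean_delta, 0.0), max_margin)
```

Under this rule the margin is computed once when the pair is built and stored alongside it, which is cheap but, as the reviewer notes, means it cannot adapt if the policy or the predictors change later in training.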
This method improves developability metrics using a preference alignment framework that does not require additional curated datasets of experimentally validated proteins. The authors evaluate MoMPNN on a strong set of tasks beyond standard sequence recovery, including redesigning CATH 4.3 crystal structures, designing sequences for de novo generated backbones, and a practical de novo binder design scenario. This rigorous evaluation demonstrates the method's utility in realistic
It would be better to report the metrics on ground-truth sequences as well, since these metrics come from prediction models and are only approximations. Full names of the abbreviations used in the tables are missing from the captions. The inference temperatures differ across baselines, which makes the comparison potentially unfair. A fair comparison would use either the greedy strategy (without temperature) or the best point on each method's temperature-performance curve; or at least
Taxonomy
Topics: Protein Structure and Dynamics · Enzyme Structure and Function · Protein purification and stability
