EnerBridge-DPO: Energy-Guided Protein Inverse Folding with Markov Bridges and Direct Preference Optimization
Dingyi Rong, Haotian Lu, Wenzhuo Zheng, Fan Zhang, Shuangjia Zheng, Ning Liu

TL;DR
EnerBridge-DPO is a protein inverse folding framework that combines Markov Bridges with Direct Preference Optimization to generate low-energy, stable protein sequences while maintaining high sequence recovery.
Contribution
The paper introduces EnerBridge-DPO, integrating Markov Bridges with DPO and explicit energy constraints to enhance energy-aware protein sequence design.
Findings
Designs sequences with lower energy than existing models
Maintains high sequence recovery rates comparable to state-of-the-art methods
Accurately predicts $\Delta\Delta G$ values for sequence stability
Abstract
Designing protein sequences with optimal energetic stability is a key challenge in protein inverse folding, as current deep learning methods are primarily trained by maximizing sequence recovery rates, often neglecting the energy of the generated sequences. This work aims to overcome this limitation by developing a model that directly generates low-energy, stable protein sequences. We propose EnerBridge-DPO, a novel inverse folding framework focused on generating low-energy, high-stability protein sequences. Our core innovation lies in: First, integrating Markov Bridges with Direct Preference Optimization (DPO), where energy-based preferences are used to fine-tune the Markov Bridge model. The Markov Bridge initiates optimization from an information-rich prior sequence, providing DPO with a pool of structurally plausible sequence candidates. Second, an explicit energy constraint loss is…
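The energy-based preference step described in the abstract can be illustrated with the standard DPO objective. The following is a minimal sketch, not the authors' implementation: it assumes sequence-level log-probabilities from the fine-tuned model and a frozen reference model are already available, and the names `dpo_loss` and `energy_preference_pair`, the choice of pairing the lowest-energy candidate against the highest-energy one, and the value `beta=0.1` are all hypothetical choices made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def energy_preference_pair(candidates):
    """Pick a (preferred, dispreferred) pair from (sequence, energy) tuples.

    Assumption: the lowest-energy candidate is preferred and the
    highest-energy candidate is dispreferred. The paper may pair differently.
    """
    ranked = sorted(candidates, key=lambda c: c[1])
    return ranked[0][0], ranked[-1][0]


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for a single preference pair.

    logp_w / logp_l: log-probabilities of the preferred / dispreferred
    sequence under the model being fine-tuned.
    ref_logp_w / ref_logp_l: the same quantities under the frozen reference.
    beta: preference temperature (illustrative value, not from the paper).
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -log_sigmoid(beta * margin)


# Toy usage with made-up sequences and energies.
pool = [("AAA", -3.0), ("GGG", 1.5), ("KKK", -1.0)]
winner, loser = energy_preference_pair(pool)
loss = dpo_loss(logp_w=-10.0, logp_l=-12.0, ref_logp_w=-11.0, ref_logp_l=-11.0)
```

When the fine-tuned model already favors the lower-energy sequence more strongly than the reference does, the margin is positive and the loss falls below log 2, its value at zero margin.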
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The motivation is well grounded: effective sequence design should explicitly favor lower-energy sequences.
1. The methodological novelty appears very limited. The bridge-based generative component and several architectural/training choices (e.g., Markov Bridge formulation, PLM backbone with AdaLN-Bias and structural adapters, frozen base weights) closely track Bridge-IF [1], and the added DPO fine-tuning for lower energy reads as a relatively incremental extension rather than a fundamentally new framework.
2. The evaluation omits designability metrics, which are critical alongside stability/energy.
Combination of Markov Bridge and DPO. The combination of Markov bridges (for structured stochastic refinement) with DPO (for preference-based alignment) is new and elegant. The paper demonstrates a theoretically consistent formulation where probabilistic modeling (via bridge processes) and preference learning (via DPO) jointly improve protein design.
Interesting empirical results. Ablation studies confirm the necessity of both DPO fine-tuning and energy supervision in various downstream results.
Energy-aware learning objective. The inclusion of a Boltzmann-aligned energy loss attempts to ground the generative process in physical thermodynamics in order to make the model interpretable and biologically relevant. However, the assumption that model probability strongly correlates with free energy is a poor decision by the authors for several reasons. The biggest reason is that the datasets used for ∆∆Gbind are very noisy, with heterogeneous analytical methods used for data collection,
+ Combines Markov Bridge generative modeling with Direct Preference Optimization, introducing energy-based fine-tuning into protein inverse folding for the first time.
+ Incorporates explicit energy constraints and ΔΔG prediction, aligning learned representations with biophysical energy landscapes.
+ Demonstrates lower energy, stable protein designs, and competitive recovery rates across multiple benchmarks with solid ablation analyses.
+ The model’s energy improvements rely on computational predictors (FoldX, Rosetta, BA-Cycle) without experimental or molecular dynamics confirmation.
+ DPO fine-tuning depends on precomputed or predicted energy scores, which may introduce bias and limit generalization to unseen proteins.
+ The paper lacks discussion on computational cost, hyperparameter sensitivity (e.g., β in DPO), and robustness across large or diverse protein complexes.
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
