EnerBridge-DPO: Energy-Guided Protein Inverse Folding with Markov Bridges and Direct Preference Optimization

Dingyi Rong; Haotian Lu; Wenzhuo Zheng; Fan Zhang; Shuangjia Zheng; Ning Liu

arXiv:2506.09496 · cs.LG · June 12, 2025

Open Access · 3 Reviews

TL;DR

EnerBridge-DPO is a protein inverse folding framework that combines Markov Bridges with Direct Preference Optimization to generate low-energy, stable protein sequences, lowering the energy of designed sequences while maintaining high sequence recovery.

Contribution

The paper introduces EnerBridge-DPO, integrating Markov Bridges with DPO and explicit energy constraints to enhance energy-aware protein sequence design.

Findings

01

Designs sequences with lower energy than existing models

02

Maintains high sequence recovery rates comparable to state-of-the-art methods

03

Accurately predicts $\Delta \Delta G$ values for sequence stability

Abstract

Designing protein sequences with optimal energetic stability is a key challenge in protein inverse folding, as current deep learning methods are primarily trained by maximizing sequence recovery rates, often neglecting the energy of the generated sequences. This work aims to overcome this limitation by developing a model that directly generates low-energy, stable protein sequences. We propose EnerBridge-DPO, a novel inverse folding framework focused on generating low-energy, high-stability protein sequences. Our core innovations are as follows. First, we integrate Markov Bridges with Direct Preference Optimization (DPO), using energy-based preferences to fine-tune the Markov Bridge model. The Markov Bridge initiates optimization from an information-rich prior sequence, providing DPO with a pool of structurally plausible sequence candidates. Second, an explicit energy constraint loss is…
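The abstract describes fine-tuning with energy-based preferences via DPO. As a rough illustration only, the sketch below shows the standard DPO loss for a single preference pair, with the lower-energy candidate treated as the preferred one. The helper names, the example sequences, the energy values, and `beta=0.1` are all hypothetical and are not taken from the paper; the authors' actual pair construction and loss may differ.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def energy_preference_pair(cand_a, cand_b):
    """Order two (sequence, energy) candidates as (preferred, rejected).

    Assumption for this sketch: the lower-energy candidate is preferred.
    """
    return (cand_a, cand_b) if cand_a[1] <= cand_b[1] else (cand_b, cand_a)


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one pair.

    logp_w / logp_l: log-likelihood of the preferred / rejected sequence
    under the model being fine-tuned; ref_* are the same quantities under
    the frozen reference model. beta scales the implicit reward margin.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -log_sigmoid(margin)


# Hypothetical candidates: (sequence, predicted energy). Illustrative values only.
winner, loser = energy_preference_pair(("MKVLA", -12.3), ("MKVLG", -8.1))

# Policy identical to the reference: the margin is zero and the loss is log(2).
loss_at_init = dpo_loss(-5.0, -5.0, -5.0, -5.0)

# Policy shifted toward the preferred sequence: the loss decreases.
loss_after = dpo_loss(-4.0, -6.0, -5.0, -5.0)
```

In this formulation a larger `beta` penalizes deviation from the reference model more sharply for a given log-likelihood shift, which is why reviewers often ask about sensitivity to it.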

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 2 · Confidence 4

Strengths

1. The motivation is well grounded: effective sequence design should explicitly favor lower-energy sequences.

Weaknesses

1. The methodological novelty appears very limited. The bridge-based generative component and several architectural/training choices (e.g., Markov Bridge formulation, PLM backbone with AdaLN-Bias and structural adapters, frozen base weights) closely track Bridge-IF [1], and the added DPO fine-tuning for lower energy reads as a relatively incremental extension rather than a fundamentally new framework.
2. The evaluation omits designability metrics, which are critical alongside stability/energy.

Reviewer 02 · Rating 4 · Confidence 4

Strengths

Combination of Markov Bridge and DPO. The combination of Markov bridges (for structured stochastic refinement) with DPO (for preference-based alignment) is new and elegant. The paper demonstrates a theoretically consistent formulation where probabilistic modeling (via bridge processes) and preference learning (via DPO) jointly improve protein design.

Interesting empirical results. Ablation studies confirm the necessity of both DPO fine-tuning and energy supervision in various downstream results.

Weaknesses

Energy-aware learning objective. The inclusion of a Boltzmann-aligned energy loss attempts to ground the generative process in physical thermodynamics in order to make the model interpretable and biologically relevant. However, the assumption that model probability strongly correlates with free energy is a poor choice by the authors for several reasons. The biggest reason is that the datasets used for ΔΔG_bind are very noisy, with heterogeneous analytical methods used for data collection,

Reviewer 03 · Rating 4 · Confidence 3

Strengths

+ Combines Markov Bridge generative modeling with Direct Preference Optimization, introducing energy-based fine-tuning into protein inverse folding for the first time.
+ Incorporates explicit energy constraints and ΔΔG prediction, aligning learned representations with biophysical energy landscapes.
+ Demonstrates lower energy, stable protein designs, and competitive recovery rates across multiple benchmarks with solid ablation analyses.

Weaknesses

+ The model’s energy improvements rely on computational predictors (FoldX, Rosetta, BA-Cycle) without experimental or molecular dynamics confirmation.
+ DPO fine-tuning depends on precomputed or predicted energy scores, which may introduce bias and limit generalization to unseen proteins.
+ The paper lacks discussion on computational cost, hyperparameter sensitivity (e.g., β in DPO), and robustness across large or diverse protein complexes.

Taxonomy

Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms