Physio-DPO: Aligning Large Language Models with the Protein Energy Landscape to Eliminate Structural Hallucinations
QiWei Meng

TL;DR
Physio-DPO is a physics-informed alignment method for protein language models that reduces structural hallucinations by incorporating thermodynamic stability, leading to more accurate and foldable protein structures.
Contribution
The paper introduces a magnitude-aware objective that aligns protein language models with the energy landscape, improving stability and reducing hallucinations relative to SFT, PPO, and standard DPO.
Findings
Reduces RMSD to 1.28 Å
Achieves 92.8% foldability
Mitigates structural hallucinations effectively
Abstract
Large protein language models have shown strong potential for generative protein design, yet they frequently produce structural hallucinations: sequences with high linguistic likelihood that fold into thermodynamically unstable conformations. Existing alignment approaches such as Direct Preference Optimization (DPO) are limited in this setting because they model preferences as binary labels and ignore the continuous structure of the physical energy landscape. We propose Physio-DPO, a physics-informed alignment framework that grounds protein language models in thermodynamic stability. Physio-DPO introduces a magnitude-aware objective that scales optimization updates according to the energy gap between native structures and physics-perturbed hard negatives. Experiments show that Physio-DPO consistently outperforms strong baselines including SFT, PPO, and standard DPO, reducing self…
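The abstract does not give the exact form of the magnitude-aware objective, so the following is only a minimal sketch of the general idea, not the paper's loss. It assumes the standard DPO loss on sequence log-probabilities and multiplies it by a weight proportional to the energy gap between the preferred (native) and rejected (perturbed) structure. The function names, the linear weighting, and the `tau` temperature are hypothetical choices made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_* are policy log-probabilities; ref_logp_* come from the
    frozen reference model. The preference is treated as binary.
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -log_sigmoid(beta * margin)


def magnitude_aware_dpo_loss(logp_w: float, logp_l: float,
                             ref_logp_w: float, ref_logp_l: float,
                             energy_w: float, energy_l: float,
                             beta: float = 0.1, tau: float = 1.0) -> float:
    """Hypothetical magnitude-aware variant (an assumption, not the paper's formula).

    The per-pair loss is scaled by the energy gap, so pairs whose rejected
    structure is much less stable (higher energy) than the preferred one
    contribute larger updates than near-ties.
    """
    # Lower energy means more stable; clamp so mislabeled pairs get zero weight.
    gap = max(energy_l - energy_w, 0.0)
    weight = gap / tau  # assumed linear scaling; tau is a hypothetical temperature
    return weight * dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta)


if __name__ == "__main__":
    # Same log-probabilities, different energy gaps (made-up illustrative values).
    small_gap = magnitude_aware_dpo_loss(-10.0, -12.0, -10.5, -11.0, -50.0, -49.0)
    large_gap = magnitude_aware_dpo_loss(-10.0, -12.0, -10.5, -11.0, -50.0, -40.0)
    print(small_gap, large_gap)
```

Under these assumptions, two pairs with identical log-probabilities receive different loss magnitudes when their energy gaps differ, which is the behavior binary-label DPO cannot express.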
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Materials Science · RNA and protein synthesis mechanisms
