Dynamics-inspired Structure Hallucination for Protein-protein Interaction Modeling
Fang Wu, Stan Z. Li

TL;DR
This paper introduces Refine-PPI, a deep learning framework that predicts the effects of mutations on protein-protein interactions by hallucinating mutant structures and modeling dynamic geometric uncertainty; it is reported to outperform existing tools in predicting free energy change.
Contribution
The paper presents a structure refinement module trained on wild-type proteins and a new geometric network, PDC-Net, to better model dynamic PPI variations and mutant structures.
Findings
Outperforms existing tools in predicting free energy change.
Effectively hallucinates mutant structures from wild-type data.
Accurately models geometric uncertainty in PPI.
Abstract
Modeling protein-protein interaction (PPI) is a central challenge in biology, and accurately predicting the consequences of mutations in this context is crucial for drug design and protein engineering. Deep learning (DL) has shown promise in forecasting the effects of such mutations, but it is hindered by two primary constraints. First, the structures of mutant proteins are often difficult to obtain. Second, PPI takes place dynamically, a fact that is rarely integrated into DL architecture design. To address these obstacles, we present a novel framework named Refine-PPI with two key enhancements. First, we introduce a structure refinement module trained by a mask mutation modeling (MMM) task on available wild-type structures, which is then transferred to produce the inaccessible mutant structures. Second, we employ a new kind of geometric network, called the probability density…
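The masking idea in the abstract can be illustrated with a toy sketch: corrupt the coordinates of residues near a chosen site in a wild-type structure, then score a refinement function by how well it recovers the originals. This is not the authors' implementation. The function names (`mask_region`, `mmm_loss`), the radius-based mask, the Gaussian noise, and the mean-squared-error objective are all hypothetical choices made for illustration.

```python
import numpy as np

def mask_region(coords, center_idx, radius=8.0, noise_std=1.0, rng=None):
    """Corrupt residues within `radius` of a chosen residue.

    Hypothetical masking scheme: the radius and Gaussian noise are
    assumptions of this sketch, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    dists = np.linalg.norm(coords - coords[center_idx], axis=1)
    mask = dists <= radius                      # residues to corrupt
    corrupted = coords.copy()
    corrupted[mask] += rng.normal(scale=noise_std, size=(int(mask.sum()), 3))
    return corrupted, mask

def mmm_loss(refine_fn, coords, center_idx, rng=None):
    """Mean squared error on the masked residues after refinement.

    `refine_fn(corrupted, mask)` stands in for a learned refinement
    module and must return an array with the same shape as `coords`.
    """
    corrupted, mask = mask_region(coords, center_idx, rng=rng)
    predicted = refine_fn(corrupted, mask)
    return float(np.mean((predicted[mask] - coords[mask]) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    wild_type = rng.normal(scale=10.0, size=(50, 3))   # fake residue coordinates
    do_nothing = lambda corrupted, mask: corrupted      # baseline "refiner"
    print(mmm_loss(do_nothing, wild_type, center_idx=0, rng=rng))
```

A refiner trained this way on wild-type structures could, under the abstract's premise, be applied around a mutation site where no experimental mutant structure exists; how the paper actually does this is not shown on this page.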
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
The proposed generalization of EGNN to sets of Gaussians is a sound idea, backed by the results in Section 4.3.3. The method of measuring free energy of association is not exactly novel, but it still represents an application of ideas from denoising pretraining to the old field of computational structural biology.
1. The generalization of EGNN to sets of Gaussians is trivial. From the text of the paper, it seems that, according to Eqs. 5 and 6, the updates to the mean and variance are not coupled, because phi_mu and phi_sigma are two independently learned functions. Appendix B.2 somewhat corrects this approach. However, on lines 250 and 262 the authors treat functions of Gaussian distributions correctly. In general, the introduction of PDC-Net is unclear in whether it treats the set of Gaussians as probability densi
- Relevant research topic.
- The suggested probability density cloud (PDC) seems interesting (maybe also for applications other than PPI): I don't remember having seen such an extension of EGNN (Satorras et al., 2021) before, but I also didn't search extensively for whether it might have been proposed previously.
- Clustering of protein chains in the benchmark (chain clusters are split into train/val/test instead of the protein chains themselves).
The description of the Refine-PPI workflow is unclear at certain points. My best guess is that the authors suggest a denoising task (i.e., MMM) on the wild-type structure for training. The motivation and post-analysis for MMM did not make it clear to me why this might be especially useful for ddG prediction. Why do the authors think that learning to denoise wild-type structures is advantageous for ddG prediction? Further presentation points that are unclear:
- Figure 3A: It seems that something is masked at t
Compared to previous studies, this paper enables predicting the mutated protein structure and the change in binding free energy simultaneously and also considers the dynamics and flexibility of conformation during the binding process. The study is well-executed and provides detailed experimental results and a comprehensive comparative analysis to demonstrate the model’s improvement over baseline models in predicting free energy changes. The paper generally provides a clear explanation of the me
1. The paper may lack a clear justification for why incorporating dynamic properties into geometric GNNs enhances model performance relative to traditional static approaches. It would be helpful to demonstrate more about the motivations for using PDC-Net and its role in predicting changes in binding affinity.
2. There is a notable formatting issue near Figure 4 and the associated paragraph, where overlapping text obscures readability.
3. The definitions of various notations in Sections 2 and 3
- The proposed method "PDC-Net" is able to propagate distributions of point clouds through EGNNs, which appears novel and relevant, including the innate uncertainty values. I think the paper should revolve around that method and use it for multiple applications.
- Lack of clarity: It remains unclear what the actual contribution is and what the main theme of the paper is. The title seems to indicate that one main component will be a method that produces "hallucinations", but later this is hardly picked up. Instead, it appears that masked mutation modeling is a core component, but a bit later PDC-Net is again crucial. Similarly, the application area remains obscure: is it about predicting protein-protein interactions or about changes in 3D conformation
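Several of the reviews above describe PDC-Net as propagating a cloud of Gaussians through an EGNN-style network, with separate learned functions phi_mu and phi_sigma for the mean and variance updates. The toy sketch below shows what one such step could look like for isotropic Gaussians. It does not reproduce the paper's Eqs. 5 and 6, which are not shown on this page: the update rules, the function name `gaussian_cloud_step`, and the fixed callables standing in for learned networks are all assumptions. The only non-hypothetical ingredient is the identity for the expected squared distance between independent isotropic 3D Gaussians.

```python
import numpy as np

def expected_sq_dist(mu, var):
    """Pairwise mean displacements and expected squared distances.

    For independent isotropic Gaussians in 3D:
    E||x_i - x_j||^2 = ||mu_i - mu_j||^2 + 3 * (var_i + var_j).
    """
    diff = mu[:, None, :] - mu[None, :, :]                    # (N, N, 3)
    d2 = (diff ** 2).sum(axis=-1) + 3.0 * (var[:, None] + var[None, :])
    return diff, d2

def gaussian_cloud_step(mu, var, phi_mu, phi_sigma):
    """One hypothetical message-passing step on N >= 2 Gaussian nodes.

    phi_mu and phi_sigma map the (N, N) matrix of expected squared
    distances to edge weights; they stand in for learned networks.
    """
    n = mu.shape[0]
    diff, d2 = expected_sq_dist(mu, var)
    no_self = 1.0 - np.eye(n)                                 # drop self-edges
    w_mu = phi_mu(d2) * no_self
    w_sigma = phi_sigma(d2) * no_self
    # Means move along displacement vectors (rotation/translation equivariant).
    new_mu = mu + (diff * w_mu[..., None]).sum(axis=1) / (n - 1)
    # Variances are rescaled multiplicatively so they stay positive (invariant).
    new_var = var * np.exp(w_sigma.sum(axis=1) / (n - 1))
    return new_mu, new_var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.normal(size=(5, 3))
    var = np.full(5, 0.5)
    new_mu, new_var = gaussian_cloud_step(
        mu, var,
        phi_mu=lambda d2: 1.0 / (1.0 + d2),
        phi_sigma=lambda d2: 0.1 * np.tanh(d2),
    )
    print(new_mu.shape, new_var.shape)
```

In this sketch both edge weights depend on a distance that mixes means and variances, which is one simple way to couple the two updates; whether PDC-Net couples them this way is exactly the question the first reviewer raises.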
Taxonomy
Topics: Protein Structure and Dynamics · Computational Drug Discovery Methods · Bioinformatics and Genomic Networks
