PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction
Xingjian Xu, Jiahui Chen, Chunmei Wang

TL;DR
PLD-Tree is a novel topology-based machine learning method that accurately predicts protein-protein binding affinities by combining persistent Laplacian topological descriptors with sequence information, outperforming existing methods.
Contribution
Introduces PLD-Tree, integrating persistent Laplacian topological features with ESM sequence data for improved protein-protein affinity prediction.
Findings
Achieves a correlation coefficient of 0.83 on benchmark datasets.
According to the authors, outperforms existing state-of-the-art methods on these benchmarks.
Validates effectiveness of topology-based descriptors in molecular affinity prediction.
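The headline metric above can be illustrated with a short sketch. The summary does not say which correlation coefficient is reported; the code below assumes the Pearson coefficient, which is common for binding affinity benchmarks. The function name `pearson_r` and the numbers in the usage lines are hypothetical and are not results from the paper.

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation between experimental and predicted affinities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Center both series, then normalize the covariance by the two spreads.
    dt = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    return float(np.sum(dt * dp) / np.sqrt(np.sum(dt ** 2) * np.sum(dp ** 2)))

# Made-up binding free energies (kcal/mol), for illustration only.
experimental = [-9.1, -7.4, -11.2, -8.0, -10.3]
predicted = [-8.7, -7.9, -10.5, -8.4, -9.6]
print(round(pearson_r(experimental, predicted), 2))
```

A value near 1 means the predicted ranking and spacing of affinities closely track the experimental ones; it does not by itself measure absolute error.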
Abstract
Recent advances in topology-based modeling have accelerated progress in physical modeling and molecular studies, including applications to protein-ligand binding affinity. In this work, we introduce the Persistent Laplacian Decision Tree (PLD-Tree), a novel method designed to address the challenging task of predicting protein-protein interaction (PPI) affinities. PLD-Tree focuses on protein chains at binding interfaces and employs the persistent Laplacian to capture topological invariants reflecting critical inter-protein interactions. These topological descriptors, derived from persistent homology, are further enhanced by incorporating evolutionary scale modeling (ESM) from a large language model to integrate sequence-based information. We validate PLD-Tree on two benchmark datasets, PDBbind V2020 and SKEMPI v2, demonstrating a correlation coefficient of 0.83 under the…
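To make the idea of interface-focused spectral descriptors concrete, here is a minimal sketch under strong simplifying assumptions. It builds a bipartite contact graph between two chains at several distance cutoffs and summarizes the spectrum of the ordinary graph Laplacian (the 0-dimensional combinatorial Laplacian) at each cutoff. This is a simplified stand-in, not the paper's persistent Laplacian, which also involves higher-dimensional simplices and pairs of filtration values. The function name, the cutoffs, the chosen statistics, and the random vector standing in for an ESM embedding are all hypothetical.

```python
import numpy as np

def laplacian_spectral_features(coords_a, coords_b, cutoffs=(4.0, 6.0, 8.0), tol=1e-8):
    """Spectral summaries of the inter-chain contact graph at several cutoffs."""
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    n_a, n_b = len(coords_a), len(coords_b)
    # Distances between atoms of chain A and atoms of chain B only,
    # so edges encode inter-protein contacts rather than intra-chain ones.
    dists = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    features = []
    for r in cutoffs:
        contact = (dists <= r).astype(float)
        adj = np.zeros((n_a + n_b, n_a + n_b))
        adj[:n_a, n_a:] = contact
        adj[n_a:, :n_a] = contact.T
        lap = np.diag(adj.sum(axis=1)) - adj  # graph Laplacian L = D - A
        eig = np.linalg.eigvalsh(lap)
        nonzero = eig[eig > tol]
        # Multiplicity of the zero eigenvalue = number of connected components.
        n_zero = int(np.sum(eig <= tol))
        features += [
            n_zero,
            nonzero.min() if nonzero.size else 0.0,
            nonzero.max() if nonzero.size else 0.0,
            nonzero.mean() if nonzero.size else 0.0,
        ]
    return np.array(features, dtype=float)

# Synthetic coordinates; real inputs would be interface atoms from a structure file.
rng = np.random.default_rng(0)
chain_a = rng.normal(size=(20, 3)) * 5.0
chain_b = rng.normal(size=(25, 3)) * 5.0 + np.array([6.0, 0.0, 0.0])
topo = laplacian_spectral_features(chain_a, chain_b)
seq_embedding = rng.normal(size=16)  # placeholder for an ESM sequence embedding
x = np.concatenate([topo, seq_embedding])
print(x.shape)  # (28,)
```

The concatenated vector `x` is the kind of input a tree-based regressor could consume; the abstract excerpt does not specify which tree model or which spectral statistics the authors use, so those choices are left open here.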
Taxonomy
Topics: Machine Learning in Bioinformatics · Protein Structure and Dynamics · Bioinformatics and Genomic Networks
