Investigating Knowledge Distillation Through Neural Networks for Protein Binding Affinity Prediction
Wajid Arshad Abbasi, Syed Ali Abbas, Maryum Bibi, Saiqa Andleeb, Muhammad Naveed Akhtar

TL;DR
This paper presents a knowledge distillation framework that transfers structural information from structure-based models to sequence-based models for protein binding affinity prediction, improving accuracy without requiring structural data during inference.
Contribution
The paper introduces a novel distillation method that uses structural data during training to improve sequence-only models for protein binding affinity prediction.
Findings
The distillation-based model outperforms the sequence-only baseline.
The Pearson correlation coefficient improves from 0.375 to 0.481.
Prediction error and bias in the affinity estimates are reduced.
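The findings above are stated in terms of correlation, error, and bias. As a minimal sketch of how such quantities are commonly computed, the snippet below implements the Pearson correlation coefficient together with two assumed stand-ins: root-mean-square error for "prediction error" and mean signed error for "bias". The summary does not say which error or bias measures the authors used, and the affinity values are made-up placeholders, not results from the paper.

```python
import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    yt = y_true - y_true.mean()  # centre both vectors
    yp = y_pred - y_pred.mean()
    return float((yt @ yp) / np.sqrt((yt @ yt) * (yp @ yp)))


def rmse(y_true, y_pred):
    """Root-mean-square error (assumed here as the error measure)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def mean_bias(y_true, y_pred):
    """Mean signed error (assumed here as the bias measure)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(y_pred - y_true))


if __name__ == "__main__":
    # Hypothetical binding affinities, for illustration only.
    y_true = [-10.0, -8.0, -12.0, -9.0]
    y_pred = [-9.5, -8.5, -11.0, -9.5]
    print("Pearson r:", pearson_r(y_true, y_pred))
    print("RMSE:", rmse(y_true, y_pred))
    print("Mean bias:", mean_bias(y_true, y_pred))
```

In a cross-validated setting, these functions would typically be applied to the pooled held-out predictions rather than to each fold separately, since a single held-out complex gives no correlation estimate.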
Abstract
The trade-off between predictive accuracy and data availability makes it difficult to predict protein-protein binding affinity accurately. The lack of experimentally resolved protein structures limits the performance of structure-based machine learning models, which generally outperform sequence-based methods. To overcome this constraint, we propose a regression framework based on knowledge distillation that uses protein structural data during training and needs only sequence data during inference. The proposed method uses binding affinity labels and intermediate feature representations to jointly supervise the training of a sequence-based student network under the guidance of a structure-informed teacher network. Leave-One-Complex-Out (LOCO) cross-validation was used to assess the framework on a non-redundant protein-protein binding affinity benchmark dataset. A maximum…
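The joint supervision described in the abstract can be sketched as a two-term objective: one term fits the student's predictions to the binding affinity labels, and the other pulls the student's intermediate features toward the teacher's. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation. The squared-error form of both terms, the weight `alpha`, and the requirement that student and teacher features share a shape are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def distillation_loss(y_true, y_student, h_student, h_teacher, alpha=0.5):
    """Joint objective for a sequence-based student (illustrative only).

    y_true    : affinity labels, shape (n,)
    y_student : student predictions, shape (n,)
    h_student : student intermediate features, shape (n, d)
    h_teacher : teacher intermediate features, shape (n, d)
    alpha     : hypothetical weight on the feature-matching term
    """
    y_true = np.asarray(y_true, dtype=float)
    y_student = np.asarray(y_student, dtype=float)
    h_student = np.asarray(h_student, dtype=float)
    h_teacher = np.asarray(h_teacher, dtype=float)

    # Label supervision: squared error against the measured affinities.
    label_term = np.mean((y_student - y_true) ** 2)
    # Feature supervision: squared distance to the teacher's representation.
    feature_term = np.mean((h_student - h_teacher) ** 2)
    return float((1.0 - alpha) * label_term + alpha * feature_term)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 8, 16  # hypothetical batch size and feature width
    y_true = rng.normal(size=n)
    y_student = y_true + 0.1 * rng.normal(size=n)
    h_teacher = rng.normal(size=(n, d))  # would come from the structure-informed teacher
    h_student = h_teacher + 0.2 * rng.normal(size=(n, d))
    print(distillation_loss(y_true, y_student, h_student, h_teacher))
```

Because the teacher's features appear only in the training loss, the student can be run on sequence input alone at inference time, which matches the setting the abstract describes.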
Taxonomy
Topics: Protein Structure and Dynamics · vaccines and immunoinformatics approaches · Machine Learning in Bioinformatics
