A multitask transfer learning framework for the prediction of virus-human protein-protein interactions
Thi Ngan Dong, Graham Brogden, Gisa Gerold, Megha Khosla

TL;DR
This paper introduces a multitask transfer learning framework that leverages large-scale protein data and domain knowledge to predict virus-human protein-protein interactions, addressing data scarcity and mutation challenges.
Contribution
The novel approach combines deep language models for protein representation with multitask learning to improve virus-human PPI prediction accuracy.
Findings
Achieves competitive results on 13 benchmark datasets.
Predicts both virus-human and bacteria-human PPIs effectively.
Uses domain knowledge as a regularizer to improve model performance.
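As a rough illustration of what using domain knowledge as a regularizer can look like in a multitask setup, the sketch below adds a weighted auxiliary loss on human protein-protein pairs to the main virus-human loss. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of binary cross-entropy for both tasks, and the `aux_weight` value are all hypothetical and are not taken from the paper.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def multitask_loss(vh_probs, vh_labels, hh_probs, hh_labels, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss.

    vh_*: predictions and labels for virus-human pairs (main task).
    hh_*: predictions and labels for human-human pairs (auxiliary task,
          standing in here for knowledge from the human interactome).
    aux_weight: hypothetical weight controlling how strongly the
          auxiliary task regularizes the shared model.
    """
    main = binary_cross_entropy(vh_probs, vh_labels)
    aux = binary_cross_entropy(hh_probs, hh_labels)
    return main + aux_weight * aux


# Toy example with made-up predictions and labels.
loss = multitask_loss(
    vh_probs=[0.9, 0.2], vh_labels=[1, 0],
    hh_probs=[0.7, 0.4], hh_labels=[1, 0],
)
print(f"combined loss: {loss:.4f}")
```

Setting `aux_weight` to 0 recovers plain single-task training, so the weight acts as a dial for how much the auxiliary signal constrains the shared representation.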
Abstract
Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein-protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and fast mutation rates of most viruses. We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep…
