Assigning function to protein-protein interactions: a weakly supervised BioBERT based approach using PubMed abstracts
Aparna Elangovan, Melissa Davis, Karin Verspoor

TL;DR
This paper presents a weakly supervised deep learning approach using BioBERT to extract and annotate protein-protein interaction functions from PubMed abstracts, significantly expanding functional annotations in biological databases.
Contribution
The study introduces PPI-BioBERT, a novel ensemble deep learning model that leverages biomedical text to identify PPI functions at large scale with improved accuracy.
Findings
Identified 3253 new typed PPIs from PubMed abstracts.
Achieved 46% precision overall, rising to 87% for acetylation interactions.
Demonstrated feasibility of large-scale PPI function annotation from biomedical literature.
Abstract
Motivation: Protein-protein interactions (PPIs) are critical to the function of proteins in both normal and diseased cells, and many critical protein functions are mediated by interactions. Knowledge of the nature of these interactions is important for the construction of networks to analyse biological data. However, only a small percentage of PPIs captured in protein interaction databases have annotations of function available, e.g. only 4% of PPIs are functionally annotated in the IntAct database. Here, we aim to label the function type of PPIs by extracting relationships described in PubMed abstracts. Method: We create a weakly supervised dataset from the IntAct PPI database containing interacting protein pairs with annotated function and associated abstracts from the PubMed database. We apply a state-of-the-art deep learning technique for biomedical natural language processing tasks,…
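The weak-supervision step described in the abstract can be illustrated with a minimal sketch. It assumes each database record carries a protein pair, a function label, and the PubMed ID of the supporting abstract; the names `KnownInteraction` and `build_weak_dataset`, the record layout, and the protein identifiers are hypothetical and are not taken from the paper or from the IntAct schema. The idea is only that a function annotation from the database is transferred to the associated abstract text, without any manual labelling of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownInteraction:
    """Hypothetical record: one function-annotated PPI from a curated database."""
    protein_a: str
    protein_b: str
    function: str    # e.g. "phosphorylation"
    pubmed_id: str   # abstract cited as evidence for the interaction


def build_weak_dataset(interactions, abstracts):
    """Attach database function labels to abstract text (weak labels).

    interactions: iterable of KnownInteraction
    abstracts:    dict mapping PubMed ID -> abstract text
    Returns a list of dicts, one per (abstract, protein pair) example.
    """
    # Index annotated pairs by the abstract that supports them.
    by_pmid = {}
    for it in interactions:
        pair = tuple(sorted((it.protein_a, it.protein_b)))  # order-independent key
        by_pmid.setdefault(it.pubmed_id, {})[pair] = it.function

    examples = []
    for pmid, text in abstracts.items():
        # Abstracts with no annotated pair contribute no examples in this sketch.
        for pair, function in by_pmid.get(pmid, {}).items():
            examples.append(
                {"pubmed_id": pmid, "text": text, "pair": pair, "label": function}
            )
    return examples


if __name__ == "__main__":
    # Placeholder identifiers and text, for illustration only.
    records = [
        KnownInteraction("PROT_B", "PROT_A", "phosphorylation", "100"),
        KnownInteraction("PROT_C", "PROT_A", "acetylation", "200"),
    ]
    texts = {"100": "PROT_A phosphorylates PROT_B.", "300": "Unrelated abstract."}
    for example in build_weak_dataset(records, texts):
        print(example)
```

Labels produced this way are noisy, because the abstract may not state the annotated function explicitly; that noise is what makes the dataset weakly rather than fully supervised.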
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks
