Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks
Alaa Abi-Haidar, Jasleen Kaur, Ana G. Maguitman, Predrag Radivojac, Andreas Retchsteiner, Karin Verspoor, Zhiping Wang, Luis M. Rocha

TL;DR
This paper presents a lightweight linear model for classifying abstracts and a word-proximity-network method for finding protein pairs and interaction passages in full text. Both were evaluated in the Second BioCreative Challenge, where the abstract classifier ranked among the top submissions.
Contribution
The paper introduces a novel, simple linear model for abstract classification and a feature expansion method using word-proximity networks for full text analysis, demonstrating competitive performance.
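The abstract describes the linear model as inspired by spam-detection techniques but does not spell out its form here. As a rough illustration only, the sketch below scores each word by how much more often it appears in relevant than in irrelevant abstracts, then sums those scores for a new abstract. The scoring rule, the threshold, and the names `tokenize`, `word_scores`, and `classify` are assumptions made for this example, not the authors' method.

```python
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, trim punctuation.
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]

def word_scores(pos_docs, neg_docs):
    # Assumed scoring rule: fraction of positive documents containing the word
    # minus the fraction of negative documents containing it.
    pos_df = Counter(w for d in pos_docs for w in set(tokenize(d)))
    neg_df = Counter(w for d in neg_docs for w in set(tokenize(d)))
    vocab = set(pos_df) | set(neg_df)
    return {w: pos_df[w] / len(pos_docs) - neg_df[w] / len(neg_docs) for w in vocab}

def classify(text, scores, threshold=0.0):
    # Linear decision: sum the scores of the distinct words in the text.
    total = sum(scores.get(w, 0.0) for w in set(tokenize(text)))
    return total > threshold, total

# Toy training data, invented for illustration.
relevant = ["protein A binds protein B", "kinase interacts with receptor"]
irrelevant = ["patients received treatment", "the study enrolled patients"]
scores = word_scores(relevant, irrelevant)

is_relevant, total = classify("the kinase binds the receptor", scores)
```

A model of this kind is cheap to train and easy to inspect, since each word's contribution to the decision is a single number.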
Findings
Among the top submissions for abstract relevance classification (IAS)
High recall and mean reciprocal rank in full text tasks
Effective feature expansion with word-proximity networks
Abstract
We participated in three of the protein-protein interaction subtasks of the Second BioCreative Challenge: classification of abstracts relevant for protein-protein interaction (IAS), discovery of protein pairs (IPS) and text passages characterizing protein interaction (ISS) in full text documents. We approached the abstract classification task with a novel, lightweight linear model inspired by spam-detection techniques, as well as an uncertainty-based integration scheme. We also used a Support Vector Machine and the Singular Value Decomposition on the same features for comparison purposes. Our approach to the full text subtasks (protein pair and passage identification) includes a feature expansion method based on word-proximity networks. Our approach to the abstract classification task (IAS) was among the top submissions for this task in terms of the measures of performance used in the…
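The feature expansion step for the full text subtasks can be pictured with a small sketch. It assumes a word-proximity network in which two words are linked by the Jaccard overlap of the passages they occur in, and it expands a seed set with neighbors above a weight cutoff. The proximity measure, the cutoff `min_weight`, and the function names are assumptions for illustration; the abstract does not give the exact definition the authors used.

```python
from collections import defaultdict
from itertools import combinations

def proximity_network(passages):
    # Map each word to the set of passage indices in which it occurs.
    occ = defaultdict(set)
    for i, passage in enumerate(passages):
        for w in set(passage.lower().split()):
            occ[w].add(i)
    # Assumed edge weight: Jaccard overlap of the two words' passage sets.
    net = defaultdict(dict)
    for a, b in combinations(sorted(occ), 2):
        shared = len(occ[a] & occ[b])
        if shared:
            weight = shared / len(occ[a] | occ[b])
            net[a][b] = weight
            net[b][a] = weight
    return net

def expand(seeds, net, min_weight=0.5):
    # Add every neighbor of a seed word whose edge weight reaches the cutoff.
    expanded = set(seeds)
    for s in seeds:
        for neighbor, weight in net.get(s, {}).items():
            if weight >= min_weight:
                expanded.add(neighbor)
    return expanded

# Toy passages, invented for illustration.
passages = [
    "binds phosphorylates kinase",
    "binds kinase complex",
    "patient cohort study",
]
net = proximity_network(passages)
expanded = expand({"binds"}, net, min_weight=0.75)
```

With the toy passages above, "kinase" always co-occurs with "binds" and is added to the feature set, while words that share only one of the two passages fall below the cutoff.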
