Evaluating Protein-protein Interaction Predictors with a Novel 3-Dimensional Metric
Haohan Wang, Madhavi K. Ganapathiraju

TL;DR
This paper introduces a new 3D evaluation metric for protein-protein interaction predictors. The metric targets what biologists need before adopting predictions: high precision on newly predicted interactions, measured independently of the test dataset.
Contribution
The work proposes a novel 3D metric that evaluates the ability of models to predict new interactions, overcoming limitations of traditional metrics like ROC and precision-recall curves.
Findings
The new metric effectively evaluates models' ability to predict novel interactions.
Traditional metrics like ROC and precision-recall fail in this evaluation context.
The proposed metric is independent of the test dataset and targets the high-precision predictions biologists need before adopting them.
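To illustrate why dataset independence matters, the sketch below uses a standard identity rather than the paper's metric: for a classifier with fixed sensitivity (TPR) and false positive rate (FPR), precision still changes with the fraction of positives in the test set. The function name `precision_from_rates` and the rate values are hypothetical, chosen only for illustration.

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by a fixed TPR and FPR at a given positive-class prevalence.

    Standard identity: precision = TPR*p / (TPR*p + FPR*(1-p)), where p is prevalence.
    """
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier: TPR = 0.8, FPR = 0.05 (illustrative values, not from the paper).
# Only the class balance of the test set changes between rows.
for prevalence in (0.5, 0.1, 0.001):
    precision = precision_from_rates(0.8, 0.05, prevalence)
    print(f"prevalence={prevalence:<6} precision={precision:.4f}")
```

The ROC operating point (TPR, FPR) is identical in all three rows, yet precision drops as positives become rarer, so a precision figure reported on one test set does not transfer to another with a different class balance.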
Abstract
For predicted interactions to be directly adopted by biologists, the machine learning predictions have to be of high precision, regardless of recall. This aspect cannot be evaluated or numerically represented well by traditional metrics like accuracy, ROC, or the precision-recall curve. In this work, we start from the correspondence between the sensitivity of ROC and the recall of the precision-recall curve, and propose an evaluation metric focusing on the ability of a model to be adopted by biologists. This metric evaluates the ability of a machine learning algorithm to predict only new interactions while eliminating the influence of the test dataset. In experiments evaluating different classifiers on the same dataset and evaluating the same predictor on different datasets, our new metric fulfills the evaluation task of our interest while two widely recognized metrics, ROC and…
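The gap between a strong ROC score and the precision a biologist would see can be shown with a small sketch. It is not the paper's 3D metric. It only contrasts ROC AUC with the precision of the top-ranked predictions on made-up toy data; the helper names `roc_auc` and `precision_at_k`, and all scores, are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at_k(labels, scores, k):
    """Fraction of true positives among the k highest-scored predictions."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


# Toy, made-up data: 10 true interactions among 1010 candidate pairs.
# 20 non-interacting pairs receive the highest scores (confident false positives).
labels = [0] * 20 + [1] * 10 + [0] * 980
scores = [0.99] * 20 + [0.90] * 10 + [0.10] * 980

print("ROC AUC:", roc_auc(labels, scores))
print("precision@10:", precision_at_k(labels, scores, 10))
```

On this toy set the ROC AUC is 0.98, yet none of the ten highest-scored predictions is a true interaction. A biologist testing only the top of the list would see no hits, which is the kind of behaviour that a recall-agnostic, precision-focused evaluation is meant to expose.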
Taxonomy
Topics: Software Engineering Research · Statistical Methods in Clinical Trials · Imbalanced Data Classification Techniques
