Protein (Multi-)Location Prediction: Using Location Inter-Dependencies in a Probabilistic Framework
Ramanuja Simha, Hagit Shatkay

TL;DR
This paper introduces a probabilistic framework that uses a collection of Bayesian network classifiers to predict multiple protein locations by directly modeling inter-dependencies among locations. It improves on classifiers that ignore those dependencies and performs comparably to YLoc+ on multi-localized proteins.
Contribution
The paper presents a method that incorporates location inter-dependencies directly into multi-location protein prediction, rather than treating locations as independent or treating each location combination seen in the training set as its own class.
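To make "location inter-dependencies" concrete, the following minimal sketch estimates how the presence of one location changes the probability of another, using only label co-occurrence counts. The toy annotations and the helper names `marginal` and `conditional` are hypothetical, and this is not the paper's Bayesian network classifier; it only illustrates the kind of dependency such a model could exploit.

```python
# Hypothetical toy annotations: one set of locations per protein.
TOY_LABELS = [
    {"nucleus", "cytoplasm"},
    {"nucleus", "cytoplasm"},
    {"nucleus", "cytoplasm"},
    {"cytoplasm"},
    {"mitochondrion"},
    {"mitochondrion"},
]


def marginal(labels, loc):
    """Fraction of proteins annotated with `loc`."""
    return sum(loc in s for s in labels) / len(labels)


def conditional(labels, target, given):
    """Fraction of proteins with `given` that also have `target`.

    Returns None when no protein is annotated with `given`.
    """
    with_given = [s for s in labels if given in s]
    if not with_given:
        return None
    return sum(target in s for s in with_given) / len(with_given)


p_nucleus = marginal(TOY_LABELS, "nucleus")
p_nucleus_given_cyto = conditional(TOY_LABELS, "nucleus", "cytoplasm")
p_nucleus_given_mito = conditional(TOY_LABELS, "nucleus", "mitochondrion")

print(p_nucleus, p_nucleus_given_cyto, p_nucleus_given_mito)
```

In this toy data, knowing that a protein is in the cytoplasm raises the estimated probability of a nuclear annotation, while a mitochondrial annotation lowers it. A classifier that treats locations as independent cannot use that signal.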
Findings
Incorporating location inter-dependencies yields significantly better predictions than classifiers that do not use them.
On multi-localized proteins, the system performs comparably to a top-performing method, YLoc+.
The approach can predict location combinations that do not appear in the training set.
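The last finding concerns the output space of the predictor. As a rough illustration, the sketch below contrasts a combination-as-class scheme, which can only emit combinations it was trained on, with a scheme that decides each location separately. The training combinations and function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical location combinations observed in a training set.
TRAIN_COMBINATIONS = [
    frozenset({"nucleus"}),
    frozenset({"cytoplasm"}),
    frozenset({"nucleus", "cytoplasm"}),
    frozenset({"mitochondrion"}),
]


def combination_class_outputs(train):
    """Each observed combination is one class, so only those can be predicted."""
    return set(train)


def per_location_outputs(train):
    """Deciding each location separately allows any non-empty subset of locations."""
    locations = sorted(set().union(*train))
    outputs = set()
    for size in range(1, len(locations) + 1):
        for combo in combinations(locations, size):
            outputs.add(frozenset(combo))
    return outputs


unseen = frozenset({"cytoplasm", "mitochondrion"})
print(unseen in combination_class_outputs(TRAIN_COMBINATIONS))  # False
print(unseen in per_location_outputs(TRAIN_COMBINATIONS))       # True
```

With three locations, the per-location scheme covers all seven non-empty subsets, while the combination-as-class scheme is limited to the four combinations it has seen.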
Abstract
Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins, assuming that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems have attempted to predict multiple locations of proteins, they typically treat locations as independent or capture inter-dependencies by treating each locations-combination present in the training set as an individual location-class. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the multiple-location-prediction process, using a collection of Bayesian network classifiers. We evaluate…
Taxonomy
Topics: Machine Learning in Bioinformatics · Computational Drug Discovery Methods · Protein Structure and Dynamics
