Combining predictors of natively unfolded proteins to detect a twilight zone between order and disorder in generic datasets
Antonio Deiana, Andrea Giansanti

TL;DR
This paper introduces and evaluates new computational indexes for distinguishing natively unfolded proteins from folded ones, especially focusing on proteins in the ambiguous twilight zone, and proposes a consensus scoring method to improve classification accuracy.
Contribution
It presents a novel global index, gVSL2, combines multiple indexes into a strictly unanimous score, SSU, and characterizes proteins in the ambiguous zone as having intermediate properties.
Findings
gVSL2 and Poodle-W outperform other indexes across datasets
The SSU score effectively identifies proteins in the twilight zone
Unclassified proteins have intermediate structural and evolutionary properties
Abstract
Natively unfolded proteins lack a well-defined three-dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Many proteins have amino acid compositions compatible with both the folded and the unfolded status, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this methodological paper, dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, here called gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous…
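The strictly unanimous combination mentioned in the abstract can be illustrated with a minimal sketch. It assumes each index has already been reduced to a binary call per protein (True for natively unfolded, False for folded); the names `Call` and `unanimous_call` and the dictionary layout are hypothetical and not taken from the paper, and the paper's actual SSU scoring convention may differ. A protein receives a definite label only when every index agrees; any disagreement places it in the twilight zone.

```python
from enum import Enum


class Call(Enum):
    FOLDED = "folded"
    UNFOLDED = "unfolded"
    TWILIGHT = "twilight zone"


def unanimous_call(predictions):
    """Combine binary index calls under a strict unanimity rule.

    predictions: dict mapping an index name to True if that index
    predicts the protein to be natively unfolded, False if folded.
    (Hypothetical input layout, assumed for illustration.)
    """
    votes = list(predictions.values())
    if not votes:
        raise ValueError("at least one index prediction is required")
    if all(votes):
        return Call.UNFOLDED   # every index says natively unfolded
    if not any(votes):
        return Call.FOLDED     # every index says folded
    return Call.TWILIGHT       # indexes disagree: leave unclassified


# Illustrative calls for one protein from the three indexes named in the abstract.
example = {"Poodle-W": True, "gVSL2": True, "mean pairwise energy": False}
result = unanimous_call(example)
```

With the example values above the indexes disagree, so the protein is left unclassified rather than forced into either class, which is the behavior a unanimity rule trades against coverage.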
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
