Methods for Secondary and Tertiary Structure Prediction of Microproteins
Julio C. Facelli

TL;DR
This paper reviews methods for predicting the secondary and tertiary structures of microproteins, highlighting current capabilities, limitations, and the need for more experimental data to improve accuracy.
Contribution
It compares three archetypical structure prediction methods (AlphaFold, I-TASSER, and ROSETTA) on a set of 44 microproteins and emphasizes the need for more experimental structures.
Findings
The methods agree reasonably well with one another and with the few available experimental structures
Prediction accuracy is lower than for larger proteins
More experimental structures are needed for better calibration
Abstract
Microproteins are a newly recognized and rapidly growing class of small proteins, typically encoded by fewer than 100 to 150 codons and translated from small open reading frames (smORFs). Although research has shown that smORFs and their corresponding microproteins constitute a significant portion of the genome and proteome, the literature still offers limited information on the structural characteristics of microproteins. In this paper, we discuss the methods available for predicting their secondary and tertiary structures and provide examples of calculations done with three archetypical methods (AlphaFold, I-TASSER, and ROSETTA). We present results predicting the structures of 44 microproteins. For this set of microproteins, the methods considered here show reasonable agreement with one another and with the very few cases in which experimental structures are available.…
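The abstract does not state how agreement between methods was measured. As a rough illustration only, one simple way to quantify agreement between secondary-structure predictions is the fraction of residues that two methods assign to the same three-state class (H for helix, E for strand, C for coil). The sketch below assumes each prediction has already been reduced to such a string; the function name `pairwise_agreement`, the labels `method_1` to `method_3`, and the example strings are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_agreement(ss_a: str, ss_b: str) -> float:
    """Return the fraction of residues given the same state by two predictions."""
    if len(ss_a) != len(ss_b):
        raise ValueError("predictions must cover the same residues")
    if not ss_a:
        raise ValueError("predictions must not be empty")
    matches = sum(a == b for a, b in zip(ss_a, ss_b))
    return matches / len(ss_a)


# Hypothetical three-state strings for one 20-residue microprotein.
# These are made-up inputs for illustration, not results from the paper.
predictions = {
    "method_1": "CC" + "H" * 10 + "CCC" + "E" * 4 + "C",
    "method_2": "CC" + "H" * 9 + "CCCC" + "E" * 4 + "C",
    "method_3": "C" + "H" * 11 + "CCC" + "E" * 3 + "CC",
}

# Agreement score for every unordered pair of methods.
scores = {
    (m, n): pairwise_agreement(predictions[m], predictions[n])
    for m, n in combinations(sorted(predictions), 2)
}

for (m, n), score in scores.items():
    print(f"{m} vs {n}: {score:.2f}")
```

A per-residue score like this only compares secondary structure. Comparing tertiary models would need a coordinate-based measure after superposition, which is outside this sketch.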
Taxonomy
Topics: RNA and protein synthesis mechanisms · Machine Learning in Bioinformatics · Protein Structure and Dynamics
