Capturing coevolutionary signals in repeat proteins
Rocío Espada, R. Gonzalo Parra, Thierry Mora, Aleksandra M. Walczak, and Diego Ferreiro

TL;DR
This paper develops a statistical method to detect true co-evolutionary signals in repeat proteins by correcting biases caused by their translational symmetry, enabling better identification of native contacts.
Contribution
It introduces a bias correction technique for statistical coupling analysis in repeat proteins, improving the detection of genuine co-evolutionary signals.
Findings
Bias correction reveals true co-evolutionary signals
Method identifies native contacts in repeat proteins
A minimum number of sequences is needed for reliable analysis
Abstract
The analysis of correlations of amino acid occurrences in globular proteins has led to the development of statistical tools that can identify native contacts -- portions of the chains that come to close distance in folded structural ensembles. Here we introduce a statistical coupling analysis for repeat proteins -- natural systems for which the identification of domains remains challenging. We show that the inherent translational symmetry of repeat protein sequences introduces a strong bias in the pair correlations at precisely the length scale of the repeat-unit. Equalizing for this bias reveals true co-evolutionary signals from which local native-contacts can be identified. Importantly, parameter values obtained for all other interactions are not significantly affected by the equalization. We quantify the robustness of the procedure and assign confidence levels to the interactions,…
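To make the repeat-length bias concrete, here is a minimal Python sketch. It is not the authors' method. It uses the fraction of identical residues at separation d as a crude stand-in for pair correlations, applied to synthetic tandem-repeat sequences. The function names, the repeat length of 6, the mutation rate, and the number of sequences are all hypothetical choices made for illustration.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def make_repeat_sequence(rng, unit_len, n_repeats, mut_rate):
    """Toy tandem-repeat sequence: one random unit copied n_repeats times,
    with independent point mutations (an assumption, not a model from the paper)."""
    unit = [rng.choice(ALPHABET) for _ in range(unit_len)]
    seq = []
    for _ in range(n_repeats):
        for residue in unit:
            if rng.random() < mut_rate:
                residue = rng.choice(ALPHABET)  # mutate to a random residue
            seq.append(residue)
    return "".join(seq)


def identity_by_separation(seqs, max_sep):
    """Fraction of position pairs (i, i + d) carrying the same residue,
    pooled over all sequences, for d = 1..max_sep."""
    profile = {}
    for d in range(1, max_sep + 1):
        same = 0
        total = 0
        for s in seqs:
            for i in range(len(s) - d):
                same += s[i] == s[i + d]
                total += 1
        profile[d] = same / total
    return profile


UNIT_LEN = 6  # hypothetical repeat-unit length
rng = random.Random(0)  # fixed seed so the toy data are reproducible
seqs = [make_repeat_sequence(rng, UNIT_LEN, 4, 0.2) for _ in range(200)]

# Keep max_sep below 2 * UNIT_LEN so only the first repeat-length peak is scanned.
profile = identity_by_separation(seqs, max_sep=10)
peak = max(profile, key=profile.get)

for d, value in profile.items():
    print(d, round(value, 3))
print("separation with the strongest signal:", peak)
```

In this toy setting the signal at the repeat-unit separation comes purely from the copied unit rather than from any structural contact. This is the kind of bias the abstract says must be equalized before co-evolutionary couplings are interpreted.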
Taxonomy
Topics: Protein Structure and Dynamics · RNA and protein synthesis mechanisms · Genetic Neurodegenerative Diseases
