TL;DR
This paper demonstrates that pairwise coevolutionary models, inferred via Boltzmann-machine learning, effectively capture the complex collective residue variability in proteins, challenging the need for more complex models.
Contribution
The study provides a systematic analysis showing pairwise models are sufficient for modeling residue variability, supported by a new inference scheme and structural insights.
Findings
Correlations are built through multiple coupling paths based on 3D structure
Pairwise models accurately predict three-residue correlations
Models capture the structure of protein families in sequence space
Abstract
Global coevolutionary models of homologous protein families, as constructed by direct coupling analysis (DCA), have recently gained popularity, in particular due to their capacity to accurately predict residue-residue contacts from sequence information alone, and thereby to facilitate tertiary and quaternary protein structure prediction. More recently, they have also been used to predict fitness effects of amino-acid substitutions in proteins, and to predict evolutionarily conserved protein-protein interactions. These models are based on two currently unjustified hypotheses: (a) correlations in the amino-acid usage of different positions result collectively from networks of direct couplings; and (b) pairwise couplings are sufficient to capture the amino-acid variability. Here we propose a highly precise inference scheme based on Boltzmann-machine learning, which allows us to…
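To make the idea of Boltzmann-machine learning of a pairwise model concrete, here is a minimal sketch on a toy system. It is not the authors' inference scheme, whose details are not given in the text above. It assumes a model of the form P(s) ∝ exp(Σ_i h_i(s_i) + Σ_{i<j} J_ij(s_i, s_j)) and adjusts the fields h and couplings J by gradient ascent until the model's one- and two-site frequencies match the empirical ones. The function names (`marginals`, `model_marginals`, `fit`), the learning rate, the step count, and the toy data (3 positions, 2 symbols) are all hypothetical choices for illustration. Model frequencies are computed by exact enumeration, which is feasible only for tiny systems; real protein families, with about 21 symbols per position and many positions, would need sampled estimates instead.

```python
import itertools
import math


def marginals(states, probs, L, q):
    """One- and two-site frequencies of a weighted collection of states."""
    f1 = [[0.0] * q for _ in range(L)]
    f2 = {(i, j): [[0.0] * q for _ in range(q)]
          for i in range(L) for j in range(i + 1, L)}
    for s, p in zip(states, probs):
        for i in range(L):
            f1[i][s[i]] += p
            for j in range(i + 1, L):
                f2[(i, j)][s[i]][s[j]] += p
    return f1, f2


def model_marginals(h, J, L, q):
    """Exact marginals of the pairwise model, by enumerating all q**L states."""
    states = list(itertools.product(range(q), repeat=L))
    weights = []
    for s in states:
        log_w = sum(h[i][s[i]] for i in range(L))
        log_w += sum(J[(i, j)][s[i]][s[j]]
                     for i in range(L) for j in range(i + 1, L))
        weights.append(math.exp(log_w))
    Z = sum(weights)  # partition function
    return marginals(states, [w / Z for w in weights], L, q)


def fit(seqs, L, q, lr=0.2, steps=5000):
    """Gradient ascent on the log-likelihood: each parameter moves by
    lr * (empirical frequency - model frequency)."""
    h = [[0.0] * q for _ in range(L)]
    J = {(i, j): [[0.0] * q for _ in range(q)]
         for i in range(L) for j in range(i + 1, L)}
    e1, e2 = marginals(seqs, [1.0 / len(seqs)] * len(seqs), L, q)
    for _ in range(steps):
        m1, m2 = model_marginals(h, J, L, q)
        for i in range(L):
            for a in range(q):
                h[i][a] += lr * (e1[i][a] - m1[i][a])
        for ij in J:
            for a in range(q):
                for b in range(q):
                    J[ij][a][b] += lr * (e2[ij][a][b] - m2[ij][a][b])
    return h, J


# Hypothetical toy "alignment": 20 sequences, 3 positions, 2 symbols.
seqs = ([(0, 0, 0)] * 5 + [(0, 0, 1)] * 2 + [(0, 1, 0)] * 1 + [(0, 1, 1)] * 2
        + [(1, 0, 0)] * 1 + [(1, 0, 1)] * 1 + [(1, 1, 0)] * 2 + [(1, 1, 1)] * 6)
L, q = 3, 2

h, J = fit(seqs, L, q)
e1, e2 = marginals(seqs, [1.0 / len(seqs)] * len(seqs), L, q)
m1, m2 = model_marginals(h, J, L, q)
max_gap = max(abs(e2[ij][a][b] - m2[ij][a][b])
              for ij in e2 for a in range(q) for b in range(q))
print(f"largest pairwise-frequency mismatch: {max_gap:.2e}")
```

Once the pairwise frequencies match, statistics that were not fitted (for example three-site frequencies) can be compared between model and data, which is the kind of test the Findings refer to. On this toy data such a comparison is illustrative only and says nothing about real protein families.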
