Hiking in the energy landscape in sequence space: a bumpy road to good folders
G. Tiana, R. A. Broglia, E. I. Shakhnovich

TL;DR
This study explores the energy landscape of heteropolymer sequences, revealing clustered low-energy sequences, conserved key residues, and the existence of super-clusters, with implications for understanding protein foldability and sequence diversity.
Contribution
It introduces a lattice model analysis showing how sequence clusters and super-clusters form in the energy landscape, highlighting conserved key residues and their relation to foldability.
Findings
Low-energy sequences form interconnected clusters.
Key residues are highly conserved within clusters.
Super-clusters emerge from degenerate attractive interactions.
Abstract
With the help of a simple 20-letter lattice model of heteropolymers, we investigate the energy landscape in the space of designed good-folder sequences. Low-energy sequences form clusters, interconnected via neutral networks, in the space of sequences. Residues which play a key role in the foldability of the chain and in the stability of the native state are highly conserved, even among chains belonging to different clusters. If, according to the interaction matrix, some strong attractive interactions are almost degenerate (i.e. they can be realized by more than one type of amino acid contact), sequence clusters group into a few super-clusters. Sequences belonging to different super-clusters are dissimilar, displaying very small () similarity, and residues in key sites are, as a rule, not conserved. Similar behavior is observed in the analysis of real protein sequences.
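The notions of sequence similarity, clusters, and conserved sites used in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: similarity is taken as the fraction of identical positions, clusters are built by simple single-linkage grouping above a hypothetical threshold, and the toy sequences are invented. It is not the authors' procedure, which works on designed low-energy sequences of the lattice model.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(sequences, threshold):
    """Single-linkage grouping: link any pair with similarity >= threshold.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    parent = list(range(len(sequences)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        if similarity(sequences[i], sequences[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(sequences):
        groups.setdefault(find(i), []).append(seq)
    return list(groups.values())


def conserved_sites(sequences):
    """Indices of positions that carry the same residue in every sequence."""
    length = len(sequences[0])
    return [k for k in range(length) if len({s[k] for s in sequences}) == 1]


# Invented toy sequences, for illustration only.
toy = ["ACDEFG", "ACDEFH", "ACDKFH", "WYWYWY"]
groups = cluster(toy, threshold=0.8)
```

In this toy set the first three sequences are linked through a chain of single-site differences, loosely mirroring how neighbouring sequences connect within a cluster, while the fourth shares no residues with them and stays apart. Calling `conserved_sites` on the members of one group lists the positions that never vary within it.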
Taxonomy
Topics: Distributed and Parallel Computing Systems
