Selection originating from protein stability/foldability: Relationships between protein folding free energy, sequence ensemble, and fitness
Sanzo Miyazawa

TL;DR
This paper establishes theoretical links between protein folding free energy, sequence ensemble distributions, and evolutionary fitness, providing a framework to estimate key parameters like selective temperature and mutation effects from sequence data.
Contribution
It introduces a unified theoretical framework connecting protein stability, sequence distributions, and evolutionary dynamics, with methods to estimate parameters from empirical data.
Findings
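Under reversible mutation and fixation processes, the equilibrium sequence ensemble follows a Boltzmann distribution whose exponent depends on Malthusian fitness and on the effective and actual population sizes.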
Protein folding free energy differences are equivalent to sequence ensemble potentials and fitness measures.
Estimated parameters such as $T_s$ and $\Delta G_{ND}$ align with observed mutation and selection patterns.
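The link between fixation and a Boltzmann factor in the first finding can be checked numerically. The sketch below is not taken from the paper: it assumes Kimura's diffusion approximation for the fixation probability of a single new mutant, and it treats the selection coefficient `s` as the difference in Malthusian fitness between two sequences. The function names and the parameter values in the usage lines are hypothetical. Under these assumptions, the ratio of forward to backward fixation probabilities equals $\exp(4 N_e s (1 - 1/(2N)))$, which is the form a reversible process needs for a Boltzmann equilibrium.

```python
import math


def fixation_probability(s, n_e, n):
    """Kimura's diffusion approximation (assumed here) for one new mutant.

    s   : selection coefficient, taken as a Malthusian fitness difference
    n_e : effective population size
    n   : actual population size (initial mutant frequency is 1 / (2n))
    """
    if abs(s) < 1e-12:
        # Neutral limit: fixation probability equals the initial frequency.
        return 1.0 / (2.0 * n)
    a = 4.0 * n_e * s
    # (1 - exp(-a / (2n))) / (1 - exp(-a)), written with expm1 for accuracy.
    return math.expm1(-a / (2.0 * n)) / math.expm1(-a)


def fixation_ratio(s, n_e, n):
    """Ratio of forward (s) to backward (-s) fixation probabilities."""
    return fixation_probability(s, n_e, n) / fixation_probability(-s, n_e, n)


def boltzmann_factor(s, n_e, n):
    """exp(4 * N_e * s * (1 - 1 / (2N)))."""
    return math.exp(4.0 * n_e * s * (1.0 - 1.0 / (2.0 * n)))


# Hypothetical parameter values, for illustration only.
ratio = fixation_ratio(0.001, 500, 1000)
factor = boltzmann_factor(0.001, 500, 1000)
print(ratio, factor)
```

The two printed numbers agree to floating-point precision, for advantageous and deleterious `s` alike, because the identity is exact under the assumed fixation formula.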
Abstract
Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with $\exp(4 N_e m (1 - 1/(2N)))$, where $m$ is Malthusian fitness and $N_e$ and $N$ are effective and actual population sizes. On the other hand, the probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with $\exp(-\psi_N)$, where $\psi_N$ is represented as the sum of one-body and pairwise potentials. A protein folding theory indicates that homologous sequences obey a canonical ensemble characterized by $\exp(-\Delta G_{ND} / k_B T_s)$, or by $\exp(-G_N / k_B T_s)$ if the amino acid composition is kept constant, where $\Delta G_{ND} \equiv G_N - G_D$, and $G_N$ and $G_D$ are the native…
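The maximum-entropy distribution described in the abstract can be illustrated on a toy system. The following is a minimal sketch, not the paper's estimation procedure: it enumerates every sequence of a hypothetical two-letter alphabet, assigns each a potential (called `psi` here) as a sum of one-body and pairwise terms, and normalizes $\exp(-\psi)$ into a Boltzmann distribution. The alphabet, sequence length, and all potential values are invented for illustration.

```python
import itertools
import math

ALPHABET = "AB"  # hypothetical two-letter alphabet
LENGTH = 3       # hypothetical sequence length

# Hypothetical one-body potentials: ONE_BODY[site][residue].
ONE_BODY = [{"A": 0.0, "B": 0.5} for _ in range(LENGTH)]

# Hypothetical pairwise potentials: PAIRWISE[(i, j)][(a, b)]; missing entries are 0.
PAIRWISE = {(0, 1): {("A", "A"): -1.0}}


def psi(seq):
    """Total potential: sum of one-body terms plus pairwise terms."""
    one_body = sum(ONE_BODY[i][a] for i, a in enumerate(seq))
    pairwise = sum(
        PAIRWISE.get((i, j), {}).get((seq[i], seq[j]), 0.0)
        for i, j in itertools.combinations(range(LENGTH), 2)
    )
    return one_body + pairwise


def boltzmann_distribution():
    """Normalize exp(-psi) over all sequences by exhaustive enumeration."""
    seqs = ["".join(p) for p in itertools.product(ALPHABET, repeat=LENGTH)]
    weights = [math.exp(-psi(s)) for s in seqs]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(seqs, weights)}


probs = boltzmann_distribution()
for seq, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(seq, round(p, 4))
```

Exhaustive enumeration is only feasible for toy sizes. For real protein families the sequence space is far too large, so the potentials have to be inferred approximately from alignments.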
