Modeling sequence-space exploration and emergence of epistatic signals in protein evolution

Matteo Bisardi; Juan Rodriguez-Rivas; Francesco Zamponi; Martin Weigt

arXiv:2106.02441 · q-bio.BM · January 28, 2022

TL;DR

This paper introduces stochastic models of protein evolution that predict features of experimental sequence libraries, enabling analysis and optimization of evolutionary experiments to detect epistatic signals and infer protein structure.

Contribution

The authors develop data-driven fitness landscape models that simulate protein evolution, providing a quantitative framework to analyze epistasis and optimize experimental design.

Findings

1. Models accurately predict fitness distributions and mutational spectra.
2. Large, diverged libraries are necessary to detect epistatic signals.
3. Framework can forecast experimental outcomes and guide protocol optimization.

Abstract

During their evolution, proteins explore sequence space via an interplay between random mutations and phenotypic selection. Here we build upon recent progress in reconstructing data-driven fitness landscapes for families of homologous proteins, to propose stochastic models of experimental protein evolution. These models predict quantitatively important features of experimentally evolved sequence libraries, like fitness distributions and position-specific mutational spectra. They also allow us to efficiently simulate sequence libraries for a vast array of combinations of experimental parameters like sequence divergence, selection strength and library size. We showcase the potential of the approach in re-analyzing two recent experiments to determine protein structure from signals of epistasis emerging in experimental sequence libraries. To be detectable, these signals require sufficiently…
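To make the mutation-and-selection picture in the abstract concrete, here is a minimal sketch of how a sequence library might be simulated. It is not the authors' implementation. It assumes a toy Potts-like landscape with randomly drawn fields and couplings (the paper's landscapes are inferred from families of homologous proteins), and it assumes a Metropolis acceptance rule as the selection step. All names (`random_landscape`, `evolve`, `simulate_library`) and the parameters `beta` (selection strength) and `n_steps` (a proxy for sequence divergence) are hypothetical.

```python
import numpy as np


def random_landscape(length, q, rng, scale=0.5):
    """Toy landscape (assumption): random fields h[i, a] and symmetric
    couplings J[i, j, a, b] with zero diagonal blocks."""
    h = rng.normal(0.0, scale, size=(length, q))
    J = rng.normal(0.0, scale / length, size=(length, length, q, q))
    J = (J + J.transpose(1, 0, 3, 2)) / 2.0  # enforce J[i,j,a,b] == J[j,i,b,a]
    J[np.arange(length), np.arange(length)] = 0.0  # no self-coupling
    return h, J


def energy(seq, h, J):
    """Statistical energy of an integer-encoded sequence; lower means fitter."""
    idx = np.arange(len(seq))
    e_fields = h[idx, seq].sum()
    # J[i, j, seq[i], seq[j]] for all pairs; halve to count each pair once.
    e_pairs = J[idx[:, None], idx[None, :], seq[:, None], seq[None, :]].sum() / 2.0
    return -(e_fields + e_pairs)


def evolve(wild_type, h, J, n_steps, beta, rng):
    """Propose random single-site mutations; accept with the Metropolis rule."""
    seq = wild_type.copy()
    e = energy(seq, h, J)
    q = h.shape[1]
    for _ in range(n_steps):
        i = rng.integers(len(seq))
        a = rng.integers(q)
        if a == seq[i]:
            continue  # silent proposal, nothing changes
        trial = seq.copy()
        trial[i] = a
        delta = energy(trial, h, J) - e
        # Selection: always keep improvements, sometimes keep deleterious moves.
        if delta <= 0 or rng.random() < np.exp(-beta * delta):
            seq, e = trial, e + delta
    return seq


def simulate_library(wild_type, h, J, library_size, n_steps, beta, rng):
    """Evolve independent lineages from the same wild type."""
    return np.array(
        [evolve(wild_type, h, J, n_steps, beta, rng) for _ in range(library_size)]
    )


rng = np.random.default_rng(0)
h, J = random_landscape(length=30, q=21, rng=rng)
wild_type = rng.integers(21, size=30)
library = simulate_library(wild_type, h, J, library_size=50, n_steps=200, beta=1.0, rng=rng)
mean_divergence = (library != wild_type).mean()  # average Hamming fraction
```

In this sketch, raising `n_steps` increases divergence from the wild type, raising `beta` strengthens selection, and `library_size` sets how many sequences are available for downstream statistics, which mirrors the three experimental parameters the abstract lists.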

Code & Models

Repositories

matteobisardi/SeqEvol
Official
