Modeling Protein Evolution via Generative Inference: From Monte Carlo Chains to Population Genetics
Leonardo Di Bari, Thierry Mora, Andrea Pagnani, Aleksandra M. Walczak, Francesco Zamponi, and Saverio Rossi

TL;DR
This paper compares three simulation methods for modeling protein evolution using generative models, finding that population genetics schemes best capture evolutionary dynamics and phylogenetic correlations.
Contribution
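The paper presents a comparative analysis of three simulation schemes for protein evolution (independent Markov Chain Monte Carlo, Monte Carlo on a phylogenetic tree, and population genetics dynamics) and shows that population genetics models are the most effective at capturing complex evolutionary features.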
Findings
Population genetics models accurately reproduce phylogenetic correlations.
Standard Monte Carlo fails to capture realistic evolutionary trajectories.
Tree-based Monte Carlo improves phylogenetic fidelity.
Abstract
Generative models derived from large protein sequence alignments define complex fitness landscapes, but their utility for accurately modeling non-equilibrium evolutionary dynamics remains unclear. In this work, we perform a rigorous comparative analysis of three simulation schemes, designed to mimic evolution in silico by local sampling of the probability distribution defined by a generative model. We compare standard independent Markov Chain Monte Carlo, Monte Carlo on a phylogenetic tree, and a population genetics dynamics, benchmarking their outputs against deep sequencing data from four distinct in vitro evolution experiments. We find that standard Monte Carlo fails to reproduce the correct phylogenetic structure and generates unrealistic, gradual mutational sweeps. Performing Monte Carlo on a tree inferred from data improves phylogenetic fidelity and historical accuracy. The…
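To make the difference between the schemes concrete, here is a minimal sketch of two of them: independent Metropolis chains started from a common ancestor, and a population genetics round of mutation followed by selection. It is an illustration under stated assumptions, not the authors' implementation. The four-letter alphabet, the site-independent `energy` function (a stand-in for a learned generative model), the mutation rate `mu`, and all function names are hypothetical. Monte Carlo on a tree is not sketched here.

```python
import math
import random

# Assumption: toy 4-letter alphabet; real protein models use 20 amino acids plus a gap.
ALPHABET = "ACDE"


def make_fields(length, rng):
    """Random per-site preferences; a hypothetical stand-in for a learned model."""
    return [{a: rng.gauss(0.0, 1.0) for a in ALPHABET} for _ in range(length)]


def energy(seq, fields):
    """Toy site-independent energy; lower energy means higher model probability."""
    return -sum(fields[i][a] for i, a in enumerate(seq))


def metropolis_step(seq, fields, rng, beta=1.0):
    """Propose one single-site change and accept it with the Metropolis rule."""
    i = rng.randrange(len(seq))
    proposal = seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]
    delta = energy(proposal, fields) - energy(seq, fields)
    if delta <= 0 or rng.random() < math.exp(-beta * delta):
        return proposal
    return seq


def independent_chains(ancestor, fields, n_chains, n_steps, rng):
    """Scheme 1: every sequence evolves alone, so chains share no history
    beyond the common ancestor."""
    result = []
    for _ in range(n_chains):
        seq = ancestor
        for _ in range(n_steps):
            seq = metropolis_step(seq, fields, rng)
        result.append(seq)
    return result


def population_round(population, fields, mu, rng, beta=1.0):
    """Scheme 3: mutate each site with probability mu, then resample the
    population with weights exp(-beta * energy). Resampling lets one lineage
    leave several descendants, which creates shared ancestry."""
    mutated = []
    for seq in population:
        chars = list(seq)
        for i in range(len(chars)):
            if rng.random() < mu:
                chars[i] = rng.choice(ALPHABET)
        mutated.append("".join(chars))
    weights = [math.exp(-beta * energy(s, fields)) for s in mutated]
    return rng.choices(mutated, weights=weights, k=len(population))


rng = random.Random(0)
fields = make_fields(12, rng)
ancestor = "A" * 12

chains = independent_chains(ancestor, fields, n_chains=5, n_steps=50, rng=rng)

population = [ancestor] * 20
for _ in range(10):
    population = population_round(population, fields, mu=0.05, rng=rng)
```

In the independent scheme, each final sequence is related to the others only through the ancestor. In the population scheme, the resampling step makes sequences share recent common ancestors, which is the kind of phylogenetic correlation the benchmark examines.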
Taxonomy
Topics: Evolution and Genetic Dynamics · Genomics and Phylogenetic Studies · Evolution and Paleontology Studies
