# Anticipating protein evolution with successor sequence predictor

**Authors:** Rayyan Tariq Khan, Pavel Kohout, Milos Musil, Monika Rosinska, Jiri Damborsky, Stanislav Mazurenko, David Bednar

PMC · DOI: 10.1186/s13321-025-00971-z · 2025-03-21

## TL;DR

The paper introduces the Successor Sequence Predictor (SSP), an in silico tool that reconstructs a protein's evolutionary history and extrapolates the trends observed in it to suggest future, potentially beneficial amino acid substitutions.

## Contribution

The novel contribution is a predictive in silico method that combines ancestral sequence reconstruction with selected physicochemical descriptors to forecast amino acid substitutions.

## Key findings

- SSP predicts mutations intended to improve protein properties such as thermostability, activity, and solubility.
- The method reduces reliance on resource-intensive experimental techniques such as directed evolution.
- The code is publicly available on GitHub, and the abstract states that mutation design will also be possible through a web server.

## Abstract

The quest to predict and understand protein evolution has been hindered by limitations on both the theoretical and the experimental fronts. Most existing theoretical models of evolution are descriptive rather than predictive, leaving the final modifications in the hands of researchers. Existing experimental techniques that help probe the evolutionary sequence space of proteins, such as directed evolution, are resource-intensive and require specialised skills. We present the Successor Sequence Predictor (SSP) as an innovative solution. SSP is an in silico protein design method that mimics laboratory-based protein evolution by reconstructing a protein's evolutionary history and suggesting future amino acid substitutions based on trends observed in that history through carefully selected physicochemical descriptors. This approach enhances specialised proteins by predicting mutations that improve desired properties, such as thermostability, activity, and solubility. SSP can thus be used as a general protein engineering tool to develop practically useful proteins. The code of SSP is provided at https://github.com/loschmidt/successor-sequence-predictor, and the design of mutations will also be possible via an easy-to-use web server at https://loschmidt.chemi.muni.cz/fireprotasr/.

The Successor Sequence Predictor advances protein evolution prediction at the amino acid level by integrating ancestral sequence reconstruction with a novel in silico approach that models evolutionary trends through selected physicochemical descriptors. Unlike prior work, SSP can forecast future amino acid substitutions that enhance protein properties such as thermostability, activity, and solubility. This method reduces reliance on resource-intensive directed evolution techniques while providing a generalizable, predictive tool for protein engineering.
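The idea of extrapolating a descriptor trend along an evolutionary lineage can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the lineage (oldest ancestor to current protein) is already reconstructed, aligned, and gap-free, it uses a single descriptor (the Kyte-Doolittle hydropathy index), and it assumes a simple least-squares line as the trend model. The names `fit_line` and `suggest_successor` are hypothetical.

```python
# Kyte-Doolittle hydropathy index, used here as one example descriptor.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, ..., n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def suggest_successor(lineage, descriptor=KD):
    """Suggest substitutions for the last sequence in `lineage`.

    `lineage` is a list of equal-length sequences ordered from the oldest
    ancestor to the current protein (an assumption of this sketch).
    Returns mutations in the usual notation, e.g. "Q2H" (1-based position).
    """
    if len(lineage) < 2:
        raise ValueError("need at least two sequences to fit a trend")
    current = lineage[-1]
    suggestions = []
    for pos in range(len(current)):
        column = [seq[pos] for seq in lineage]
        if len(set(column)) == 1:
            continue  # conserved position: no trend to extrapolate
        slope, intercept = fit_line([descriptor[aa] for aa in column])
        predicted = intercept + slope * len(lineage)  # one step past the present
        # Pick the residue whose descriptor value is closest to the prediction.
        best = min(descriptor, key=lambda aa: abs(descriptor[aa] - predicted))
        # Only report a change that actually moves the descriptor value.
        if descriptor[best] != descriptor[current[pos]]:
            suggestions.append(f"{current[pos]}{pos + 1}{best}")
    return suggestions


print(suggest_successor(["MRA", "MKA", "MQA"]))  # ['Q2H']
```

In the toy lineage, position 2 moves from R to K to Q with rising hydropathy, so the extrapolated value lands nearest to histidine. A realistic pipeline would combine several descriptors, weight them by how well the trend fits, and derive the lineage from a phylogenetic tree; those steps are outside this sketch.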

The online version contains supplementary material available at 10.1186/s13321-025-00971-z.

## Full-text entities

- **Genes:** GZMB (granzyme B) [NCBI Gene 3002] {aka C11, CCPI, CGL-1, CGL1, CSP-B, CSPB}, ADPRH (ADP-ribosylarginine hydrolase) [NCBI Gene 141] {aka ARH1, hARH1}, SH2D1A (SH2 domain containing 1A) [NCBI Gene 4068] {aka DSHP, EBVS, IMD5, LYP, MTCP1, SAP}
- **Diseases:** SSP (MESH:D010855), Cold (MESH:D000067390), viral infection (MESH:D014777), infectious diseases (MESH:D003141), SIPS (MESH:D000081042), cancer (MESH:D009369)
- **Chemicals:** aminoglycosides (MESH:D000617), kanamycin (MESH:D007612), amino acid (MESH:D000596), paromomycin (MESH:D010303), neomycin (MESH:D009355), gentamicin B (MESH:C023804), SSP (-), ribostamycin (MESH:D012271), butirosin (MESH:D002076)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** V200A, D24N, N55K, K38Q, A46L, N55S, E50K, N55D, A46E, E66K, A46K

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11927200/full.md

---
Source: https://tomesphere.com/paper/PMC11927200