# Investigating Statistical Conditions of Coevolutionary Signals that Enable Algorithmic Predictions of Protein Partners

**Authors:** José Fiorote, João Alves, Letícia Stock, Werner Treptow

PMC · DOI: 10.1021/acs.jcim.5c00052 · 2025-04-15

## TL;DR

This paper examines the statistical conditions under which coevolutionary signals in amino acid sequences allow algorithms to predict protein partners without relying on 3D structures.

## Contribution

The study introduces a Markov stochastic model that predicts the number of correct protein partners from coevolutionary information in amino acid sequences.

## Key findings

- Algorithms that maximize coevolutionary information cannot effectively resolve partners in protein families with a large number of sequences (M ≥ 100).
- Disregarding mismatches among similar sequences improves true-positive (TP) rates.
- The model distinguishes optimized solutions with trivial errors from other degenerate solutions in terms of the coevolutionary information gap α and the variance σ₀².
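
The idea of disregarding mismatches among similar sequences can be illustrated with a small scoring sketch. This is not the authors' code: the function names, the index-based pairing representation, the identity measure, and the `min_identity` threshold are all assumptions made for illustration. A predicted partner is counted as correct under the relaxed score if it is either the true partner or sufficiently similar to it.

```python
from typing import Sequence, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def tp_rates(
    pred: Sequence[int],
    true: Sequence[int],
    seqs_b: Sequence[str],
    min_identity: float = 0.9,  # hypothetical similarity threshold
) -> Tuple[float, float]:
    """Return (strict TP rate, relaxed TP rate).

    pred[i] and true[i] are indices into seqs_b giving the predicted and
    the true partner of sequence i from the other family. The relaxed
    rate also accepts a wrong partner whose sequence is at least
    `min_identity` identical to the true partner (a "trivial" error).
    """
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and equally long")
    strict = 0
    relaxed = 0
    for p, t in zip(pred, true):
        if p == t:
            strict += 1
            relaxed += 1
        elif identity(seqs_b[p], seqs_b[t]) >= min_identity:
            relaxed += 1
    n = len(true)
    return strict / n, relaxed / n
```

In this sketch the strict rate penalizes every swap, while the relaxed rate forgives swaps between near-identical sequences, which is the sense in which TP rates can rise when trivial errors are ignored.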

## Abstract

This study examines the statistical conditions of coevolutionary signals that allow algorithmic predictions of protein partners based on amino acid sequences rather than 3D structures. It introduces a Markov stochastic model that predicts the number of correct protein partners based on coevolutionary information. The model defines state probabilities using a Poisson mixture of normal distributions, with key parameters including the total number of protein sequences M, the coevolutionary information gap α, and the variance σ₀². The model suggests that algorithmic approaches that maximize coevolutionary information cannot effectively resolve partners in protein families with a large number of sequences (M ≥ 100). The model also shows that true-positive (TP) rates can be enhanced by disregarding mismatches among similar sequences. This approach allows a distinction, in terms of {α, σ₀²}, between optimized solutions with trivial errors and other degenerate solutions. Our findings enable the a priori classification of protein families where partners can be reliably predicted by ignoring trivial errors between similar sequences, advancing the understanding of coevolutionary models for large protein data sets.
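
For readers unfamiliar with the term, a Poisson mixture of normal distributions is a weighted sum of normal densities whose weights follow a Poisson distribution. The sketch below shows only that generic construction. How the paper maps M, α, and σ₀² onto the Poisson rate and the component means and variances is not stated in this summary, so `lam`, `means`, and `variances` are hypothetical inputs, and the truncation with renormalized weights is a simplification chosen here.

```python
import math
from typing import List, Sequence


def poisson_weights(lam: float, k_max: int) -> List[float]:
    """Poisson probabilities for k = 0..k_max, renormalized to sum to 1."""
    raw = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_max + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def poisson_normal_mixture_pdf(
    x: float,
    lam: float,
    means: Sequence[float],
    variances: Sequence[float],
) -> float:
    """Density of a truncated Poisson mixture of normals at x.

    Component k (k = 0..K) has weight proportional to Poisson(k; lam),
    mean means[k], and variance variances[k]. All three inputs are
    illustrative assumptions, not the paper's parameterization.
    """
    if len(means) != len(variances) or not means:
        raise ValueError("means and variances must be non-empty and equally long")
    weights = poisson_weights(lam, len(means) - 1)
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
```

A caller would supply one mean and one variance per mixture component; for example, equal variances and evenly spaced means give a multi-peaked density whose peak heights follow the Poisson weights.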

## Full-text entities

- **Genes:** MYOM2 (myomesin 2) [NCBI Gene 9172] {aka TTNAP}
- **Chemicals:** amino acids (MESH:D000596)

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12042258/full.md

---
Source: https://tomesphere.com/paper/PMC12042258