# Integrating AlphaFold2 models and clinical data to improve the assessment of Short Linear Motifs (SLiMs) and their variants’ pathogenicity

**Authors:** Franco Gino Brunello, Lorenzo Erra, Juan Nicola, Marcelo Adrián Martí, Mohammad Sadegh Taghizadeh

PMC · DOI: 10.1371/journal.pcbi.1012829 · PLOS Computational Biology · 2025-08-04

## TL;DR

This paper improves the identification of protein regions called SLiMs and their disease-causing variants by using AlphaFold2 models and clinical data.

## Contribution

The study introduces an updated version of MotSASi that uses AlphaFold2 to overcome structural data limitations and improves variant deleteriousness prediction.

## Key findings

- AlphaFold2-derived structures reliably reproduce known SLiM-receptor complexes and capture variant effects.
- The updated MotSASi method identifies more high-confidence SLiMs and outperforms AlphaMissense in variant deleteriousness prediction.
- The approach aligns with ACMG/AMP standards and can improve clinical genomic diagnostics.

## Abstract

Short Linear Motifs (SLiMs) are functionally relevant protein regions that mediate reversible protein-protein interactions. Variants that disrupt SLiMs can lead to numerous Mendelian diseases. Although various bioinformatic tools have been developed to identify SLiMs, most suffer from low specificity. In our previous work, we demonstrated that integrating sequence variant information with structural analysis can enhance the prediction of true functional SLiMs while simultaneously generating tolerance matrices that indicate whether each of the 19 possible single amino acid substitutions (SASs) is tolerated. However, the scarcity of representative crystallographic structures of SLiM-receptor complexes posed a significant limitation. In this study, we demonstrate that these interactions can be modeled using AlphaFold2 (AF2) to generate high-quality structures that serve as input for our MotSASi method. These AF2-derived structures show robust performance, both in reproducing known structures deposited in the Protein Data Bank (PDB) and in reflecting the deleterious effects of known sequence variants. This updated version of MotSASi expands the repertoire of high-confidence predicted SLiMs and provides a comprehensive catalog of variants located within SLiMs, along with their respective deleteriousness assessments. When compared to AlphaMissense, MotSASi demonstrates superior performance in predicting variant deleteriousness. By contributing to the accurate identification and interpretation of variants, this work aligns with ACMG/AMP standards and aims to improve diagnostic rates in clinical genomics.

Proteins interact with each other in highly specific ways to carry out vital biological functions. Short Linear Motifs (SLiMs) are small regions within proteins that mediate many of these reversible interactions. Changes in SLiMs can disrupt these interactions and lead to severe genetic disorders. Identifying SLiMs accurately has been a longstanding challenge, as many computational tools suffer from low specificity. Previously, we developed a method, MotSASi, that combines sequence variation data and structural analysis to improve SLiM prediction and assess the impact of single amino acid substitutions (SASs). However, the lack of available structural data limited its application. In this study, we demonstrate that structures generated using AlphaFold2 (AF2) can overcome this limitation. These high-quality AF2 models reliably reproduce known structures and capture the harmful effects of sequence variations. By integrating AF2 models, the updated MotSASi method identifies more high-confidence SLiMs and provides detailed assessments of the variants within them. MotSASi outperforms existing tools, such as AlphaMissense, in predicting the impact of genetic variants, offering insights aligned with clinical standards. This advancement can aid in understanding disease mechanisms and improving genetic diagnostics in clinical genomics.
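The abstract describes tolerance matrices that record, for each motif position, whether each of the 19 possible single amino acid substitutions is tolerated. The sketch below shows one way such a matrix could be laid out as a data structure. It is not MotSASi's implementation: the function names, the placeholder motif `ACDE`, and the `toy_scorer` rule are all assumptions made for illustration. In the paper, the tolerated/not-tolerated decision comes from sequence variant information and structural analysis, which this toy rule does not attempt to reproduce.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_substitutions(motif):
    """Yield (position, wild_type, substitute) for every SAS of the motif."""
    for pos, wild_type in enumerate(motif):
        for aa in AMINO_ACIDS:
            if aa != wild_type:  # 19 substitutions per position
                yield pos, wild_type, aa


def build_tolerance_matrix(motif, is_tolerated):
    """Return {position: {substitute: bool}} using a caller-supplied scorer."""
    matrix = {pos: {} for pos in range(len(motif))}
    for pos, wild_type, aa in single_substitutions(motif):
        matrix[pos][aa] = is_tolerated(pos, wild_type, aa)
    return matrix


def toy_scorer(pos, wild_type, aa):
    """Hypothetical rule: positions 0 and 3 tolerate no change, others any."""
    fixed_positions = {0, 3}
    return pos not in fixed_positions


motif = "ACDE"  # placeholder sequence, not a real SLiM
matrix = build_tolerance_matrix(motif, toy_scorer)
```

A lookup such as `matrix[1]["G"]` then answers whether substituting glycine at the second motif position is tolerated under the chosen scorer. Swapping `toy_scorer` for a function backed by real evidence is the part that carries the scientific content.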

## Full-text entities

- **Genes:** KLHDC10 (kelch domain containing 10) [NCBI Gene 23008] {aka PNAS-138, slim}
- **Diseases:** Mendelian diseases (MESH:D030342)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12338786/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12338786/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12338786/full.md

---
Source: https://tomesphere.com/paper/PMC12338786