# Scaling the profile of life by function with SPIN

**Authors:** Andrea Mancini, Vinh-Son Pho, Alessandro Bianchi, Gianluca Lombardi, Chujun Lyu, Alessandra Carbone

PMC · DOI: 10.1093/bioadv/vbag064 · Bioinformatics Advances · 2026-02-19

## TL;DR

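SPIN is a deep learning method that classifies hundreds of thousands of protein sequences into functional classes in time linear in the number of sequences, making function prediction scalable to large databases and supporting the study of protein functions and their evolutionary relationships.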

## Contribution

SPIN introduces a deep learning approach for large-scale protein function classification whose time complexity is linear in the number of sequences.

## Key findings

- SPIN classifies hundreds of thousands of protein sequences into functional classes efficiently.
- SPIN identifies family-specific conserved residues, revealing functional nuances in protein subclasses.
- SPIN trades off accuracy against computational cost relative to small, family-specific protein language models (pLMs).

## Abstract

Classifying hundreds of thousands of protein sequences by function remains a significant computational challenge. Building on the ProfileView method for identifying functional classes and subclasses, our goal is to achieve large-scale classification of proteins from extensive databases and ongoing high-throughput sequencing efforts, ultimately producing comprehensive sets of sequences that share the same function.

By applying deep learning techniques, SPIN learns discriminative patterns in functionally related sequences, allowing the classification of hundreds of thousands of sequences into a defined number of functional classes. SPIN offers an effective compromise between small, family-specific protein language models (pLMs) and computational cost, with a time complexity linear in the number of sequences. It enables the identification of family-specific conserved residues, providing insight into the functional nuances of protein subclasses. By enhancing the scalability of protein function predictors, SPIN advances our understanding of protein functions and their evolutionary relationships.
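To make the linear-time claim concrete, here is a minimal sketch of why per-sequence classification into a fixed number of classes scales linearly. It is not SPIN's architecture. The composition featurizer, the nearest-centroid rule, and every name in the code (`featurize`, `classify`, `classify_all`, the toy classes) are assumptions made for illustration. The point is only that each sequence is scored independently against K class representatives, so N sequences cost on the order of N·K operations, with no all-against-all comparison.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def featurize(seq: str) -> list[float]:
    """Map a sequence to a fixed-length amino-acid composition vector.

    Assumption: a stand-in for a learned representation, not SPIN's features.
    """
    counts = Counter(seq.upper())
    n = max(len(seq), 1)  # avoid division by zero on an empty sequence
    return [counts[a] / n for a in AMINO_ACIDS]


def classify(seq: str, centroids: dict[str, list[float]]) -> str:
    """Return the label of the nearest class centroid (squared Euclidean distance)."""
    x = featurize(seq)

    def sq_dist(label: str) -> float:
        return sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[label]))

    return min(centroids, key=sq_dist)


def classify_all(seqs: list[str], centroids: dict[str, list[float]]) -> list[str]:
    """Classify N sequences with N independent calls; no pairwise comparisons."""
    return [classify(s, centroids) for s in seqs]


# Hypothetical toy classes built from made-up sequences.
centroids = {
    "lys_rich": featurize("KKKKKKAK"),
    "asn_rich": featurize("NNNNNNAN"),
}
labels = classify_all(["KKKAKK", "NNANNN"], centroids)
```

Because the work per sequence depends only on its length and on the number of classes, doubling the number of input sequences roughly doubles the running time in this sketch.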

The data and code that support the findings of this study are publicly available at https://gitlab.lcqb.upmc.fr/andrea.mancini/SPIN.

## Full-text entities

- **Genes:** SRC (SRC proto-oncogene, non-receptor tyrosine kinase) [NCBI Gene 6714] {aka ASV, SRC1, THC6, c-SRC, p60-Src}, SPIN1 (spindlin 1) [NCBI Gene 10927] {aka SPIN, TDRD24}
- **Diseases:** DL (MESH:C537113)
- **Chemicals:** asparagines (MESH:D001216), lysines (MESH:D008239), amino acid (MESH:D000596), ESM2-35M (-), disulfide (MESH:D004220)
- **Species:** Chlamydomonas reinhardtii (species) [taxon 3055], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12970593/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12970593/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12970593/full.md

---
Source: https://tomesphere.com/paper/PMC12970593