# Spectral Cruncher: A Visualization Tool Integrating Manual Curation, Ion-Intensity Prediction, and De Novo Tag Generation

**Authors:** Aline A. M. Martins, Blake L. Tsu, Hulyana Brum, Lucas Sales, Marlon Dias Mariano dos Santos, Juliana de Saldanha da Gama Fischer, Stephanie Almeida, Luisa Bulcao Vieira Coelho, Natalia Moreira, Alysson R. Muotri, Paulo Costa Carvalho

PMC · DOI: 10.1021/jasms.5c00301 · Journal of the American Society for Mass Spectrometry · 2025-12-29

## TL;DR

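Spectral Cruncher is an interactive extension to PatternLab for Proteomics that lets scientists inspect tandem mass spectra by combining manual curation with automated annotation, de novo sequence tags, and transformer-based fragment-ion intensity prediction in one graphical environment.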

## Contribution

The novel contribution is the integration of a transformer-based ion-intensity predictor (SpecFormer) with manual curation and de novo tag generation in a unified proteomics platform.

## Key findings

- SpecFormer reaches average cosine similarities of approximately 0.98 on bulk Q-Exactive+ data, 0.91 on bulk Astral data, and 0.87 on Astral single-cell data.
- The tool supports interactive analysis of ambiguous spectra and validation of peptide identifications in a unified graphical environment.
- Spectral Cruncher is freely available in PatternLab 5.1, promoting expert-driven workflows and learning.
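
The cosine similarities reported above compare a predicted fragment-ion intensity vector with the observed one. As a minimal sketch of that metric (not the authors' evaluation code), the function below assumes both vectors list intensities for the same fragment ions in the same order; the name `cosine_similarity` and the convention of returning 0.0 for an all-zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(predicted, observed):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))

    # Assumption: an all-zero vector carries no signal, so report 0.0
    # instead of dividing by zero.
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0
    return dot / (norm_p * norm_o)


if __name__ == "__main__":
    # Hypothetical relative intensities for five matched fragment ions.
    predicted = [0.90, 0.10, 0.40, 0.00, 0.25]
    observed = [1.00, 0.05, 0.50, 0.02, 0.20]
    print(round(cosine_similarity(predicted, observed), 3))
```

Because the metric depends only on the direction of the vectors, rescaling either spectrum (for example, normalizing to the base peak) leaves the score unchanged.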

## Abstract

Here, we introduce Spectral Cruncher, an interactive extension to the PatternLab for Proteomics platform, designed to bridge the gap between manual curation and state-of-the-art computational analysis of proteomic tandem mass spectra. Spectral Cruncher integrates de novo sequence tag extraction, automated spectral annotation, targeted tag search, and a customized transformer-based fragment-ion intensity predictor (SpecFormer) within a unified graphical environment, designed for interactive and instrument-specific visualization. Central to this workflow is SpecFormer, a compact transformer architecture trained on multiple data sets, providing independent ion intensity models for Q-Exactive+ bulk, Astral bulk, and Astral single-cell proteomics data, enabling accurate and instrument-specific intensity prediction even under conditions of sparse fragmentation and low signal-to-noise ratios. Evaluation of SpecFormer demonstrates high predictive performance, with average cosine similarities of approximately 0.98 for bulk Q-Exactive+ data sets, 0.91 for bulk Astral, and 0.87 for Astral single-cell data. These tools enable researchers to interrogate ambiguous spectra, validate peptide identifications, and develop intuition for algorithmic limitations. The tools are freely available within PatternLab 5.1, lowering technical barriers and promoting broader adoption of interactive, expert-driven workflows as well as providing a learning environment. A video of our tool in action is available at https://youtu.be/tc2sPiqJkLA.
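
The abstract does not describe how sequence tags are extracted, so the following is only a generic illustration of the idea behind de novo tags, not Spectral Cruncher's algorithm. It assumes a list of singly charged fragment m/z values from a single ion series and chains together peaks whose spacing matches a monoisotopic amino acid residue mass. The function name `extract_tags`, the 0.02 Da tolerance, and the minimum tag length of 3 are all hypothetical choices.

```python
# Monoisotopic residue masses in Da. Leucine and isoleucine are isobaric,
# so both are reported as "L".
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def extract_tags(peaks_mz, tol=0.02, min_len=3):
    """Return maximal sequence tags read from residue-spaced peak chains."""
    peaks = sorted(peaks_mz)
    edges = {i: [] for i in range(len(peaks))}
    has_incoming = set()

    # Connect every pair of peaks whose m/z gap matches a residue mass.
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            delta = peaks[j] - peaks[i]
            for residue, mass in RESIDUE_MASSES.items():
                if abs(delta - mass) <= tol:
                    edges[i].append((j, residue))
                    has_incoming.add(j)

    tags = set()

    def walk(i, tag):
        # Follow each outgoing edge; record the tag where the chain ends.
        if not edges[i]:
            if len(tag) >= min_len:
                tags.add(tag)
            return
        for j, residue in edges[i]:
            walk(j, tag + residue)

    # Start only at peaks that no chain leads into, so tags are maximal.
    for start in range(len(peaks)):
        if start not in has_incoming:
            walk(start, "")
    return sorted(tags)


if __name__ == "__main__":
    # Hypothetical ladder: 200.0, then +P, +V, +T.
    print(extract_tags([200.0, 297.05276, 396.12117, 497.16885]))
```

Real spectra mix ion series, charge states, and noise peaks, and some residue combinations are isobaric with single residues (for example, G+A and Q), which is one reason interactive inspection of candidate tags remains useful.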

## Full-text entities

- **Diseases:** Glioblastoma (MESH:D005909)
- **Chemicals:** H+ (MESH:D006859), H2O (MESH:D014867), Asn (MESH:D001216), Ser (MESH:D012694), Lys (MESH:D008239), Thr (MESH:D013912), Arg (MESH:D001120), Asp (MESH:D001224), NH3 (MESH:D000641), Gln (MESH:D005973), cysteine (MESH:D003545), amino acid (MESH:D000596), Glu (MESH:D018698), dipeptide (MESH:D004151)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]
- **Cell lines:** HeLa — Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_0030), C57BL/6 — Mus musculus (Mouse), Transformed cell line (CVCL_C0MU)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12879931/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12879931/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/PMC12879931/full.md

---
Source: https://tomesphere.com/paper/PMC12879931