# Complete Data Analysis Workflow for Quantitative DIA Mass Spectrometry Using Nextflow

**Authors:** Mats Perk, Sami Pietilä, Tommi Välikangas, Balazs Balint, Tomi Suomi, Laura L. Elo

PMC · DOI: 10.1021/acs.jproteome.5c00266 · 2026-02-06

## TL;DR

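This paper introduces glaDIAtor-nf, a Nextflow version of the glaDIAtor software package for untargeted analysis of DIA mass spectrometry proteomics data. The workflow is validated on gold standard data sets and applied to a reanalysis of public breast cancer data.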

## Contribution

The contribution is glaDIAtor-nf, a Nextflow version of the glaDIAtor software package that automates untargeted DIA proteomics analysis and supports parallel execution on high-performance computing infrastructure.

## Key findings

- The technical accuracy of glaDIAtor-nf was demonstrated on gold standard data sets.
- Reanalysis of public breast cancer data revealed known proteome patterns that had remained hidden in the processed data of the original study.
- The reanalysis highlights the need for convenient tools that facilitate large-scale reanalysis of accumulating public data.

## Abstract

Data-independent acquisition (DIA) mass spectrometry is a technique used in proteomics to identify and quantify proteins in complex biological samples. While this comprehensive approach yields more complete and reproducible protein profiles than data-dependent acquisition (DDA), the resulting data are substantially larger and more complex, presenting significant challenges for data analysis and interpretation. These challenges can be effectively addressed using dedicated workflow managers that support parallel execution of complex analysis pipelines on high-performance computing infrastructure. Nextflow, in particular, is well-suited for streamlining data analysis, as it automates key aspects of workflow management, allowing researchers to efficiently analyze large-scale data sets with minimal manual intervention. Here, we describe glaDIAtor-nf, a Nextflow version of our software package glaDIAtor for untargeted analysis of DIA mass spectrometry proteomics data. We first demonstrate its technical accuracy through rigorous testing on gold standard data sets. Building on this, we then reveal known proteome patterns from public breast cancer data that remained hidden in the processed data of the original study. This illustrates the potential of reanalyzing the accumulating public data, but also highlights the need for convenient tools to facilitate such reanalysis at large scale.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** breast cancer (MESH:D001943)

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12973368/full.md

---
Source: https://tomesphere.com/paper/PMC12973368