# MsTargetPeaker: A Quality-Aware Deep Reinforcement Learning Approach for Peak Identification in Targeted Proteomics

**Authors:** Chi Yang, Yung-Chin Hsiao, Chi-Ching Lee, Lichieh Julie Chu, Ta-Sen Yeh, Ping-Chang Cheng, Petrus Tang, Jau-Song Yu

PMC · DOI: 10.1016/j.mcpro.2026.101523 · 2026-02-02

## TL;DR

MsTargetPeaker improves automated peak identification in targeted proteomics by using reinforcement learning and dynamic quality assessment to enhance accuracy and interpretability.

## Contribution

Introduces MsTargetPeaker, a quality-aware deep reinforcement learning method for peak boundary identification in targeted proteomics.

## Key findings

- MsTargetPeaker improves agreement with manual reference boundaries and peak area ratio correlations.
- The method uses a custom seven-component reward function to dynamically assess and optimize peak quality.
- Diagnostic reports generated by MsTargetPeaker enable efficient quality control of peak groups.
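
As a rough illustration of how several quality components could be folded into one scalar reward, the sketch below uses a weighted mean. This is an assumption for illustration only: the component names, the weights, and the weighted-mean aggregation are hypothetical, and this summary does not list the seven components or how the paper combines them.

```python
from typing import Mapping


def combined_reward(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of per-component quality scores (each expected in [0, 1]).

    Hypothetical aggregation; the paper's actual reward formula is not given here.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in scores) / total_weight


# Hypothetical component names and values, chosen only to make the example concrete.
example_scores = {"peak_shape": 0.9, "interference": 0.6, "cross_sample_consensus": 0.75}
example_weights = {"peak_shape": 2.0, "interference": 1.0, "cross_sample_consensus": 1.0}

reward = combined_reward(example_scores, example_weights)
print(f"combined reward: {reward:.4f}")
```

Keeping each component as a named score is what would make a per-component diagnostic report straightforward: the same dictionary that feeds the reward can be written out for review.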

## Abstract

Targeted mass spectrometry enables precise peptide quantification by identifying high-quality chromatographic peaks for area integration. Automated peak identification remains challenging, particularly for low-abundance targets, because of interference and noise. Existing approaches typically rely on two supervised learning models, one for selecting peak regions and the other for performing downstream quality control in a separate postprocessing step. However, deferring quality assessment to a separate stage may limit the ability to refine peak boundaries in pursuit of improved quality, as the initial selection is performed without explicit awareness of quality-related criteria. In this study, we present MsTargetPeaker, a quality-aware search procedure for identifying peak regions in targeted proteomics data. The method employs a reinforcement learning agent to guide Monte Carlo tree search to efficiently explore chromatograms and localize target peaks while minimizing interference. Peak quality is dynamically assessed during the search via a custom-designed reward function, which prioritizes regions with desirable peak characteristics and enables accurate and robust boundary determination. The reward function further incorporates cross-sample consensus profiles of candidate boundaries to improve the identification of low-quality or ambiguous signals. These innovations support fine-grained peak identification, enhancing both peak quality and quantification precision. In addition, the transparent reward calculation allows MsTargetPeaker to generate interpretable diagnostic quality reports, providing comprehensive metrics across transitions, peak groups, and sample replicates. This facilitates efficient detection of problematic cases for manual curation. Collectively, MsTargetPeaker offers a practical advancement toward robust and automated peak identification in targeted proteomics.
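
To make the idea of an agent guiding Monte Carlo tree search more concrete, the following sketch shows a PUCT-style selection step, in which a policy prior from the agent biases which child node is explored next. The abstract does not state which selection rule MsTargetPeaker uses, so the PUCT formula, the `c_puct` constant, and the action names (`expand_left`, `expand_right`, `stop`) are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Child:
    prior: float            # probability assigned to this action by the agent's policy
    visits: int = 0         # number of times the search has taken this action
    value_sum: float = 0.0  # sum of rewards backed up through this action

    @property
    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child: Child, parent_visits: int, c_puct: float = 1.5) -> float:
    """Mean value plus an exploration bonus scaled by the policy prior."""
    exploration = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.mean_value + exploration


def select_action(children: Dict[str, Child], c_puct: float = 1.5) -> str:
    """Pick the action with the highest PUCT score."""
    parent_visits = sum(c.visits for c in children.values()) + 1
    return max(children, key=lambda a: puct_score(children[a], parent_visits, c_puct))


# Hypothetical boundary-adjustment actions with made-up statistics.
children = {
    "expand_left": Child(prior=0.2, visits=10, value_sum=6.0),
    "expand_right": Child(prior=0.7, visits=0),
    "stop": Child(prior=0.1, visits=2, value_sum=1.0),
}
chosen = select_action(children)
print(chosen)
```

In this toy state the unvisited action with the largest prior wins, which is the intended effect of a prior-weighted bonus: the agent's preference steers early exploration, and accumulated rewards take over as visit counts grow.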

- A reinforcement learning agent is trained for multiple reaction monitoring/parallel reaction monitoring peak boundary picking.
- Peak quality is assessed dynamically and optimized by Monte Carlo tree search.
- Cross-sample consensus profiles guide consistent boundary selection across runs.
- MsTargetPeaker is accurate across dilution ratios and test datasets.
- Interpretable diagnostic reports enable efficient quality control of peak groups.

Accurate chromatographic peak boundaries are essential for targeted proteomics but remain difficult to automate for low-abundance or ambiguous signals. MsTargetPeaker performs quality-aware peak identification by combining a deep reinforcement learning agent with Monte Carlo tree search. Peak quality is evaluated during inference using a custom seven-component reward function. Across a response-curve dataset and nine external datasets, MsTargetPeaker increased agreement with manual reference boundaries and improved peak area ratio correlations with manual reference peaks. Transparent scoring also yields diagnostic reports for efficient review and quality control.
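
The two evaluation quantities mentioned above, agreement with manual boundaries and peak area ratio correlation, can be sketched as follows. This is a minimal sketch under stated assumptions: the area ratio is taken as light (endogenous) over heavy (internal standard), boundary agreement is measured as interval overlap (intersection over union), and correlation is Pearson's r. The summary does not confirm that the paper uses exactly these definitions, and all function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def peak_area(times: Sequence[float], intensities: Sequence[float],
              left: float, right: float) -> float:
    """Trapezoidal area using only the sampled points with left <= time <= right."""
    pts = [(t, y) for t, y in zip(times, intensities) if left <= t <= right]
    return sum(0.5 * (y0 + y1) * (t1 - t0) for (t0, y0), (t1, y1) in zip(pts, pts[1:]))


def area_ratio(times, light, heavy, left, right) -> float:
    """Light/heavy area ratio within one shared pair of boundaries (assumed convention)."""
    heavy_area = peak_area(times, heavy, left, right)
    return peak_area(times, light, left, right) / heavy_area if heavy_area else float("nan")


def boundary_overlap(pred: Tuple[float, float], ref: Tuple[float, float]) -> float:
    """Intersection over union of two retention-time intervals."""
    inter = max(0.0, min(pred[1], ref[1]) - max(pred[0], ref[0]))
    union = (pred[1] - pred[0]) + (ref[1] - ref[0]) - inter
    return inter / union if union > 0 else 0.0


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy chromatogram: triangular light and heavy traces on a shared time axis.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
light = [0.0, 2.0, 4.0, 2.0, 0.0]
heavy = [0.0, 4.0, 8.0, 4.0, 0.0]

ratio = area_ratio(times, light, heavy, left=1.0, right=3.0)
overlap = boundary_overlap((10.0, 12.0), (11.0, 13.0))
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In practice the correlation would be computed between automated and manual area ratios across many peak groups; the three-point input here only checks the arithmetic.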

## Full-text entities

- **Genes:** SFTPC (surfactant protein C) [NCBI Gene 6440] {aka BRICD6, PSP-C, SFTP2, SMDP2, SP-C}, CRYGD (crystallin gamma D) [NCBI Gene 1421] {aka CACA, CCA3, CCP, CRYG4, CTRCT4, PCC}
- **Diseases:** MCTS (MESH:D021184)
- **Chemicals:** DeepMRM (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12966724/full.md

---
Source: https://tomesphere.com/paper/PMC12966724