Towards Less Biased Data-driven Scoring with Deep Learning-Based End-to-end Database Search in Tandem Mass Spectrometry
Yonghan Yu, Ming Li

TL;DR
DeepSearch introduces a novel deep learning-based end-to-end approach for peptide identification in mass spectrometry, reducing bias and eliminating the need for statistical estimation, with robust performance across diverse datasets.
Contribution
DeepSearch is the first method to apply a transformer-based deep learning model to end-to-end peptide-spectrum matching. It profiles modifications in a zero-shot manner and reduces bias in scoring.
Findings
DeepSearch outperforms traditional methods in accuracy.
It profiles post-translational modifications without prior training.
The scoring scheme shows less bias and requires no statistical estimation.
Abstract
Peptide identification in mass spectrometry-based proteomics is crucial for understanding protein function and dynamics. Traditional database search methods, though widely used, rely on heuristic scoring functions and must introduce statistical estimation to achieve a higher identification rate. Here, we introduce DeepSearch, the first deep learning-based end-to-end database search method for tandem mass spectrometry. DeepSearch leverages a modified transformer-based encoder-decoder architecture under the contrastive learning framework. Unlike conventional methods that rely on ion-to-ion matching, DeepSearch adopts a data-driven approach to score peptide-spectrum matches. DeepSearch is also the first deep learning-based method that can profile variable post-translational modifications in a zero-shot manner. We showed that DeepSearch's scoring scheme expressed less bias and did not…
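To make the idea of contrastive, data-driven scoring concrete, the following is a minimal sketch and not the DeepSearch implementation. It assumes that some encoder (not modeled here) has already mapped each spectrum and each candidate peptide to a fixed-length embedding, that a peptide-spectrum match is scored by cosine similarity, and that training uses an InfoNCE-style loss. The abstract does not specify the loss or the scoring function, so `cosine`, `info_nce`, `rank_candidates`, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(spectrum_embs, peptide_embs, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed objective, not from the paper).

    Entry i of each list is treated as a matched spectrum/peptide pair;
    every other peptide in the batch acts as a negative for spectrum i.
    """
    total = 0.0
    for i, spec in enumerate(spectrum_embs):
        logits = [cosine(spec, pep) / temperature for pep in peptide_embs]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denom - logits[i]
    return total / len(spectrum_embs)


def rank_candidates(spectrum_emb, candidate_embs):
    """Return candidate indices sorted by descending similarity to the spectrum."""
    scores = [cosine(spectrum_emb, cand) for cand in candidate_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy two-dimensional embeddings, invented purely for illustration.
spectra = [[1.0, 0.0], [0.0, 1.0]]
peptides_aligned = [[0.9, 0.1], [0.1, 0.9]]   # pair i matches spectrum i
peptides_swapped = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_aligned = info_nce(spectra, peptides_aligned)
loss_swapped = info_nce(spectra, peptides_swapped)
best = rank_candidates(spectra[0], peptides_aligned)[0]
```

In this sketch the loss is small when matched pairs are closest in embedding space and large when they are mismatched, and database search reduces to ranking candidate peptides by similarity to the query spectrum. No hand-written ion-matching rule appears anywhere; the score comes entirely from the learned embeddings, which is the sense in which such a scheme is data-driven.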
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Mass Spectrometry Techniques and Applications · Advanced Proteomics Techniques and Applications
Methods: Contrastive Learning
