Transformer-based de novo peptide sequencing for data-independent acquisition mass spectrometry
Shiva Ebrahimi, Xuan Guo

TL;DR
This paper introduces DiaTrans, a transformer-based deep learning model that significantly improves de novo peptide sequencing from DIA mass spectrometry data, addressing the multiplexing challenge and outperforming existing methods.
Contribution
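DiaTrans is a transformer-based model that sequences peptides de novo from highly multiplexed DIA spectra, in which each spectrum mixes fragment ions from several precursor peptides. It reports precision gains of up to 34.8% and recall gains of up to 31.94% over prior methods.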
Findings
DiaTrans improves precision by up to 34.8% over existing methods.
Recall increases by up to 31.94%.
Peptide-level precision reaches 81.36%.
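The summary does not say how these figures are computed. As a rough illustration only, the sketch below shows one common way to score de novo predictions: residue-level precision and recall plus peptide-level precision. Everything here is an assumption rather than the paper's evaluation code: the function names are hypothetical, residues are compared by exact position instead of by mass tolerance, and an empty string stands for a spectrum with no prediction.

```python
def count_residue_matches(predicted: str, truth: str) -> int:
    """Count positions where predicted and true residues agree.

    Simplifying assumption: exact positional comparison. Published
    evaluations typically match residues by mass within a tolerance.
    """
    return sum(p == t for p, t in zip(predicted, truth))


def score_predictions(pairs):
    """Score (predicted, truth) peptide pairs; predicted == "" means no call."""
    matched = predicted_residues = true_residues = 0
    correct_peptides = predicted_peptides = 0
    for predicted, truth in pairs:
        true_residues += len(truth)
        if not predicted:
            continue  # no prediction: hurts recall, not precision
        predicted_peptides += 1
        predicted_residues += len(predicted)
        matched += count_residue_matches(predicted, truth)
        if predicted == truth:
            correct_peptides += 1
    return {
        "aa_precision": matched / predicted_residues if predicted_residues else 0.0,
        "aa_recall": matched / true_residues if true_residues else 0.0,
        "peptide_precision": (
            correct_peptides / predicted_peptides if predicted_peptides else 0.0
        ),
    }


# Hypothetical toy data, not taken from the paper.
example_pairs = [
    ("PEPTIDE", "PEPTIDE"),  # fully correct
    ("PEPTLDE", "PEPTIDE"),  # one wrong residue
    ("", "ACDK"),            # no prediction made
]
scores = score_predictions(example_pairs)
```

Under these assumptions, precision is penalised only by wrong residues in peptides the model chose to report, while recall also counts spectra it left unsequenced.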
Abstract
Tandem mass spectrometry (MS/MS) stands as the predominant high-throughput technique for comprehensively analyzing protein content within biological samples. This methodology is a cornerstone driving the advancement of proteomics. In recent years, substantial strides have been made in Data-Independent Acquisition (DIA) strategies, facilitating impartial and non-targeted fragmentation of precursor ions. The DIA-generated MS/MS spectra present a formidable obstacle due to their inherent high multiplexing nature. Each spectrum encapsulates fragmented product ions originating from multiple precursor peptides. This intricacy poses a particularly acute challenge in de novo peptide/protein sequencing, where current methods are ill-equipped to address the multiplexing conundrum. In this paper, we introduce DiaTrans, a deep-learning model based on transformer architecture. It deciphers peptide…
Taxonomy
Topics: Advanced Proteomics Techniques and Applications
Methods: Fragmentation
