Faster graphical model identification of tandem mass spectra using peptide word lattices
Shengjie Wang, John T. Halloran, Jeff A. Bilmes, William S. Noble

TL;DR
This paper introduces a faster, more accurate machine learning method for peptide identification in shotgun proteomics by leveraging word lattices for computational efficiency and a discriminative training framework for improved statistical power.
Contribution
It presents a novel application of word lattices to accelerate DRIP and a discriminative training approach to enhance peptide-spectrum match accuracy.
Findings
Speedups on the order of tens of times in peptide identification
Increased number of spectrum identifications at 1% FDR
Improved statistical power over previous methods
Abstract
Liquid chromatography coupled with tandem mass spectrometry, also known as shotgun proteomics, is a widely used high-throughput technology for identifying proteins in complex biological samples. Analysis of the tens of thousands of fragmentation spectra produced by a typical shotgun proteomics experiment begins by assigning to each observed spectrum the peptide hypothesized to be responsible for generating the spectrum, typically done by searching each spectrum against a database of peptides. We have recently described a machine learning method, Dynamic Bayesian Network for Rapid Identification of Peptides (DRIP), that not only achieves state-of-the-art spectrum identification performance on a variety of datasets but also provides a trainable model capable of returning valuable auxiliary information regarding specific peptide-spectrum matches. In this work, we present two significant…
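The database-search step described in the abstract can be illustrated with a minimal sketch. This is not DRIP and not the authors' code: the shared-peak score, the tolerances, the tiny residue-mass table, and all function names (peptide_mass, fragment_mzs, score, search) are assumptions chosen only to show the general shape of matching a spectrum against candidate peptides.

```python
# Toy database search: NOT DRIP, only an illustration of the general idea.
# Monoisotopic residue masses (Da) for a few amino acids, enough for the example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_mzs(peptide):
    """Theoretical singly charged b- and y-ion m/z values."""
    mzs = []
    for i in range(1, len(peptide)):
        b = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        mzs.extend([b, y])
    return mzs


def score(peptide, observed_mzs, frag_tol=0.5):
    """Assumed toy score: count theoretical fragments matched by an observed peak."""
    return sum(
        any(abs(theo - obs) <= frag_tol for obs in observed_mzs)
        for theo in fragment_mzs(peptide)
    )


def search(precursor_mass, observed_mzs, database, precursor_tol=3.0):
    """Return (best_peptide, best_score), or (None, 0) if no candidate is in range."""
    # Keep only peptides whose mass is close to the measured precursor mass.
    candidates = [
        p for p in database
        if abs(peptide_mass(p) - precursor_mass) <= precursor_tol
    ]
    if not candidates:
        return None, 0
    best = max(candidates, key=lambda p: score(p, observed_mzs))
    return best, score(best, observed_mzs)


# Simulate a noise-free spectrum from "GASP" and search a three-peptide database.
database = ["GASP", "PSAG", "VLKE"]
print(search(peptide_mass("GASP"), fragment_mzs("GASP"), database))
```

Here "PSAG" has the same precursor mass as "GASP" and so passes the mass filter, but none of its fragments match, while "VLKE" is rejected by the precursor tolerance alone. Real search engines, and DRIP in particular, replace the shared-peak count with far richer scoring models.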
Taxonomy
Topics: Advanced Proteomics Techniques and Applications · Machine Learning in Bioinformatics · Mass Spectrometry Techniques and Applications
