Running PeptideProphet Separately on Replicates Improves Peptide Identification Results
Chao Yang, Zengyou He, Weichuan Yu

TL;DR
Running PeptideProphet separately on each replicate and then combining the outputs, following the Bagging principle, improves peptide identification in shotgun proteomics compared with merging Peptide-Spectrum Matches (PSMs) from all replicates before modeling.
Contribution
This paper proposes running PeptideProphet separately on each replicate and combining the outputs, which improves peptide identification over the standard routine of merging PSMs from all replicates before building the statistical model.
Findings
Consistent improvement over standard PeptideProphet on a standard protein dataset
Improved results on a Human dataset and a Yeast dataset
Per-replicate analysis followed by combination outperforms merging PSMs before model building
Abstract
Limited spectrum coverage is a problem in shotgun proteomics, and replicates are generated to improve it. When integrating peptide identification results obtained from replicates, the state-of-the-art algorithm PeptideProphet combines Peptide-Spectrum Matches (PSMs) before building the statistical model to calculate peptide probabilities. In this paper, we identify a connection between merging the results of replicates and Bagging, which is a standard routine for improving the power of statistical methods. Following Bagging's philosophy, we propose to run PeptideProphet separately on each replicate and combine the outputs to obtain the final peptide probabilities. In our experiments, we show that the proposed routine improves PeptideProphet consistently on a standard protein dataset, a Human dataset, and a Yeast dataset.
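The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact rule used to combine per-replicate outputs, so the code below assumes a simple Bagging-style average of peptide probabilities, with a peptide that is absent from a replicate counted as probability 0. The function name `combine_replicates` and the input layout (one dictionary of peptide sequence to probability per replicate, as might be parsed from each separate PeptideProphet run) are hypothetical and are not part of PeptideProphet itself.

```python
from collections import defaultdict
from typing import Dict, List


def combine_replicates(replicate_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-replicate peptide probabilities (Bagging-style aggregation).

    Assumption (not from the paper): a peptide missing from a replicate
    contributes probability 0 to the average.
    """
    if not replicate_probs:
        return {}

    totals: Dict[str, float] = defaultdict(float)
    for probs in replicate_probs:
        for peptide, p in probs.items():
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range for {peptide}: {p}")
            totals[peptide] += p

    # Divide by the total number of replicates, not by the number of
    # replicates in which the peptide was observed.
    n = len(replicate_probs)
    return {peptide: total / n for peptide, total in totals.items()}


# Hypothetical per-replicate outputs, one dictionary per separate run.
replicates = [
    {"PEPTIDEA": 1.0, "PEPTIDEB": 0.5},
    {"PEPTIDEA": 0.5},
]
final_probs = combine_replicates(replicates)
```

Under this assumed rule, a peptide identified confidently in several replicates keeps a high combined probability, while one seen in a single replicate is down-weighted. Other aggregation rules, such as taking the maximum, would fit the same interface.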
Taxonomy
Topics: Advanced Proteomics Techniques and Applications · Machine Learning in Bioinformatics · Gene expression and cancer classification
