Efficient online learning for large-scale peptide identification
Xijun Liang, Zhonghang Xia, Yongxiang Wang, Ling Jian, Xinnan Niu, Andrew Link

TL;DR
This paper introduces OLCS-Ranker, an online learning algorithm that improves peptide identification accuracy and efficiency on large-scale, challenging datasets by using cost-sensitive learning and iterative training.
Contribution
It presents a novel online learning method with cost-sensitive loss functions for peptide identification, reducing memory use and increasing speed on large datasets.
Findings
OLCS-Ranker outperforms benchmark methods in accuracy and stability.
OLCS-Ranker is 15-85 times faster than CRanker on large datasets.
The method effectively reduces false discovery rates on hard datasets.
Abstract
Motivation: Post-database searching is a key procedure in peptide identification with tandem mass spectrometry (MS/MS) strategies for refining peptide-spectrum matches (PSMs) generated by database search engines. Although many statistical and machine learning-based methods have been developed to improve the accuracy of peptide identification, challenges remain on large-scale datasets and on datasets with an extremely large proportion of false positives (hard datasets). A more efficient learning strategy is required to improve the performance of peptide identification on challenging datasets. Results: In this work, we present an online learning method to address the challenges that remain for existing peptide identification algorithms. We propose a cost-sensitive learning model that uses different loss functions for decoy and target PSMs. A larger penalty for wrongly…
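The idea of cost-sensitive online learning over PSMs can be illustrated with a small sketch. This is not the OLCS-Ranker algorithm itself: it assumes a plain linear scorer, a weighted hinge loss, and one subgradient step per PSM. The function names, the three toy features, and the cost values are all hypothetical. The abstract is cut off before it states which error carries the larger penalty, so the choice here (decoy errors cost more) is only an illustrative assumption and both costs are left as parameters.

```python
def score(w, x):
    """Linear score of one PSM feature vector x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def cost_sensitive_hinge(s, label, cost_target=1.0, cost_decoy=2.0):
    """Weighted hinge loss; label is +1 (target PSM) or -1 (decoy PSM).

    The two costs are illustrative assumptions, not values from the paper.
    """
    cost = cost_target if label > 0 else cost_decoy
    return cost * max(0.0, 1.0 - label * s)


def online_update(w, x, label, lr=0.1, reg=0.01,
                  cost_target=1.0, cost_decoy=2.0):
    """One subgradient step on a single PSM; only w is kept in memory."""
    margin = label * score(w, x)
    cost = cost_target if label > 0 else cost_decoy
    # L2 shrinkage is applied on every step.
    w = [wi * (1.0 - lr * reg) for wi in w]
    # The loss term contributes only when the margin is violated.
    if margin < 1.0:
        w = [wi + lr * cost * label * xi for wi, xi in zip(w, x)]
    return w


def train(psms, dim, passes=20, **kwargs):
    """Iterate over the PSMs one at a time, for several passes."""
    w = [0.0] * dim
    for _ in range(passes):
        for x, label in psms:
            w = online_update(w, x, label, **kwargs)
    return w


# Toy data: (hypothetical features incl. a constant bias term, label).
psms = [
    ([2.0, 1.0, 1.0], +1),    # target PSM
    ([-1.0, -0.5, 1.0], -1),  # decoy PSM
    ([1.5, 0.5, 1.0], +1),    # target PSM
    ([-2.0, -1.0, 1.0], -1),  # decoy PSM
]
w = train(psms, dim=3)
```

Because each step touches a single PSM, memory use depends on the number of features rather than the number of PSMs, which is the general reason online methods scale to large datasets.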
Taxonomy
Topics: Advanced Proteomics Techniques and Applications · Machine Learning in Bioinformatics · Vaccines and Immunoinformatics Approaches
