A Combinatorial Perspective of the Protein Inference Problem
Chao Yang, Zengyou He, and Weichuan Yu

TL;DR
This paper introduces a combinatorial approach to protein inference in shotgun proteomics, providing a closed-form model that improves efficiency and offers insights into peptide contributions, with competitive results against existing methods.
Contribution
The paper presents a novel combinatorial model for protein inference that yields closed-form solutions and enhances computational efficiency over previous methods like ProteinProphet.
Findings
The model provides bounds and empirical estimates for protein probabilities.
It demonstrates accuracy competitive with ProteinProphet.
The approach efficiently handles unique and degenerate peptides.
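The distinction between unique and degenerate peptides in the last finding can be illustrated with a minimal sketch. It assumes the standard proteomics usage, where a unique peptide maps to exactly one candidate protein and a degenerate peptide is shared by several. The function name `classify_peptides`, the input layout, and the toy peptide and protein identifiers are all hypothetical and are not taken from the paper; the paper's actual probability model is not reproduced here.

```python
def classify_peptides(peptide_to_proteins):
    """Split peptides into unique and degenerate groups.

    peptide_to_proteins: dict mapping a peptide sequence to the list of
    protein identifiers that contain it (hypothetical input layout).
    """
    unique = {}
    degenerate = {}
    for peptide, proteins in peptide_to_proteins.items():
        protein_set = set(proteins)  # drop duplicate protein IDs
        if len(protein_set) == 1:
            # Exactly one parent protein: a unique peptide.
            unique[peptide] = next(iter(protein_set))
        elif len(protein_set) > 1:
            # Shared by several proteins: a degenerate peptide.
            degenerate[peptide] = sorted(protein_set)
    return unique, degenerate


# Toy data, invented for illustration only.
example = {
    "PEPTIDEA": ["P1"],
    "PEPTIDEB": ["P1", "P2"],
    "PEPTIDEC": ["P2"],
}
unique, degenerate = classify_peptides(example)
```

In this toy input, `PEPTIDEB` is the only degenerate peptide, since it is evidence for both `P1` and `P2`; any protein inference method has to decide how such shared evidence is apportioned.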
Abstract
In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from the results of peptide identification. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we take a combinatorial perspective on the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (the protein probability is the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimate of protein probabilities, respectively. The combinatorial perspective enables us to obtain a…
Taxonomy
Topics: Machine Learning in Bioinformatics · Advanced Proteomics Techniques and Applications · Algorithms and Data Compression
