TL;DR
ProtRank is a rank-based approach to differential expression analysis in proteomics that handles missing values directly, without imputation. It is robust to missing data and produces results comparable to those of existing methods.
Contribution
The paper presents a new ranking-based method for differential expression analysis in proteomics that removes the need to impute missing values.
Findings
Robust to missing data in proteomic datasets
Produces results similar to edgeR, a state-of-the-art method
Available as an easy-to-use Python package
Abstract
Data from discovery proteomic and phosphoproteomic experiments typically include missing values that correspond to proteins that have not been identified in the analyzed sample. Replacing the missing values with random numbers, a process known as "imputation", avoids apparent infinite fold-change values. However, the procedure comes at a cost: Imputing a large number of missing values has the potential to significantly impact the results of the subsequent differential expression analysis. We propose a method that identifies differentially expressed proteins by ranking their observed changes with respect to the changes observed for other proteins. Missing values are taken into account by this method directly, without the need to impute them. We illustrate the performance of the new method on two distinct datasets and show that it is robust to missing values and, at the same time,…
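The idea of ranking changes instead of imputing values can be illustrated with a minimal sketch. This is not the ProtRank implementation or its API: the function name `rank_changes`, the use of NaN to mark missing values, and the rule that a protein appearing (or disappearing) ranks above (or below) every finite fold change are all assumptions made for illustration only.

```python
import numpy as np

def rank_changes(before, after):
    """Hypothetical helper: score each protein's change by its rank among all changes.

    Missing values are marked with NaN and are never replaced by imputed numbers.
    Observed intensities are assumed to be strictly positive.
    Returns scores in [0, 1]; NaN where the protein is missing in both samples.
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    miss_b, miss_a = np.isnan(before), np.isnan(after)

    change = np.full(before.shape, np.nan)
    both = ~miss_b & ~miss_a
    change[both] = np.log2(after[both] / before[both])
    # Assumption: missing -> observed counts as a larger increase than any
    # finite fold change, and observed -> missing as a larger decrease.
    change[miss_b & ~miss_a] = np.inf
    change[~miss_b & miss_a] = -np.inf

    valid = ~(miss_b & miss_a)
    vals = change[valid]
    # Average ranks (1-based), so tied changes receive the same rank.
    less = (vals[None, :] < vals[:, None]).sum(axis=1)
    equal = (vals[None, :] == vals[:, None]).sum(axis=1)
    ranks = less + (equal + 1) / 2

    scores = np.full(before.shape, np.nan)
    n = vals.size
    scores[valid] = (ranks - 1) / (n - 1) if n > 1 else 0.5
    return scores

# Five proteins measured in two samples; NaN marks "not identified".
before = [100.0, 200.0, np.nan, 50.0, np.nan]
after = [200.0, 200.0, 80.0, np.nan, np.nan]
scores = rank_changes(before, after)
```

Because only the ordering of the changes matters here, the infinite fold changes that would otherwise motivate imputation never enter a numerical computation; they simply occupy the extreme ranks.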
