Beware of so-called 'good' correlations: a statistical reality check on individual mRNA-protein predictions
Romain-Daniel Gosselin

TL;DR
This paper critically examines the limitations of using correlation coefficients between mRNA and protein levels to infer protein function, emphasizing the variability and potential misinterpretations in transcriptomics-proteomics studies.
Contribution
It provides a statistical analysis highlighting the variability and limitations of mRNA-protein correlations, cautioning against overreliance on correlation for functional inference.
Findings
Correlation coefficients are incomplete indicators of protein levels.
A notable proportion of mRNA-protein pairs have abundances at opposite ends of their respective distributions.
Correlation alone is insufficient for inferring protein function from mRNA data.
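The second finding can be illustrated with a small simulation. The sketch below is not the paper's analysis: it assumes that standardized mRNA and protein levels follow a bivariate normal distribution, and it defines "opposite ends" as one measurement in the top third of its distribution while the other is in the bottom third. The correlation values, the tertile cutoff, and the function names `simulate_pairs` and `opposite_fraction` are all illustrative choices.

```python
import numpy as np


def simulate_pairs(rho, n=100_000, seed=0):
    """Draw n standardized (mRNA, protein) pairs with correlation rho.

    Assumption: a bivariate normal model, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)


def opposite_fraction(pairs, q=1 / 3):
    """Fraction of pairs with one value in the top q tail and the other in the bottom q tail."""
    mrna, protein = pairs[:, 0], pairs[:, 1]
    lo_m, hi_m = np.quantile(mrna, [q, 1 - q])
    lo_p, hi_p = np.quantile(protein, [q, 1 - q])
    high_mrna_low_protein = (mrna >= hi_m) & (protein <= lo_p)
    low_mrna_high_protein = (mrna <= lo_m) & (protein >= hi_p)
    return float((high_mrna_low_protein | low_mrna_high_protein).mean())


# Hypothetical correlation values, not taken from the paper.
for rho in (0.4, 0.6, 0.8):
    frac = opposite_fraction(simulate_pairs(rho))
    print(f"rho = {rho:.1f}: {frac:.1%} of pairs fall at opposite ends")
```

Under these assumptions the discordant fraction shrinks as the correlation grows but does not vanish at the values shown, which is the sense in which a "good" coefficient can still hide many misleading individual predictions.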
Abstract
Research in the life sciences often employs messenger ribonucleic acid (mRNA) quantification as a standalone approach for functional analysis. However, although the correlation between the measured levels of mRNA and proteins is positive, the correlation coefficients observed empirically are incomplete, necessitating caution in making agnostic inferences. This essay provides a statistical reflection and caveat on the concept of correlation strength in the context of transcriptomics-proteomics studies. It highlights the variability in possible protein levels at given empirical correlation values, even for a precise mRNA amount, and underscores the notable proportion of mRNA-protein pairs with abundances at opposite ends of their respective distributions. Cell biologists, data scientists, and biostatisticians should recognise that mRNA-protein correlation alone is insufficient to justify using…
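A short worked example may help make the point about variability at a fixed mRNA amount concrete. It is a sketch under assumptions that are not stated in the abstract: mRNA level $X$ and protein level $Y$ are standardized (mean 0, standard deviation 1) and jointly normal with correlation $\rho$, and $\rho = 0.6$ is a hypothetical value chosen for illustration.

```latex
% Conditional distribution of protein level given an exact mRNA level
\[
  Y \mid X = x \;\sim\; \mathcal{N}\!\left(\rho x,\; 1 - \rho^{2}\right)
\]
% Illustrative case: rho = 0.6 and x = 1 (mRNA one SD above its mean)
\[
  \mathbb{E}[Y \mid X = 1] = 0.6,
  \qquad
  \operatorname{SD}[Y \mid X = 1] = \sqrt{1 - 0.6^{2}} = 0.8
\]
% Central 95% range of plausible protein levels
\[
  0.6 \pm 1.96 \times 0.8 = [-0.968,\; 2.168]
\]
```

In this illustrative case, an mRNA measured exactly one standard deviation above its mean is compatible with protein levels ranging from about one standard deviation below the mean to about two above it.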
Taxonomy
Topics: RNA Research and Splicing · RNA and protein synthesis mechanisms · Molecular Biology Techniques and Applications
