Selecting between-sample RNA-Seq normalization methods from the perspective of their assumptions
Ciaran Evans, Johanna Hardin, Daniel Stoebel

TL;DR
This paper emphasizes the importance of understanding the assumptions behind RNA-Seq normalization methods, showing how assumption violations can lead to inaccurate gene expression measures and affect downstream analyses.
Contribution
It provides a detailed analysis of normalization methods based on their assumptions, guiding researchers in selecting appropriate methods for their data.
Findings
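A normalization method is accurate only when its assumptions hold for the data being analyzed.
When those assumptions are violated, normalized expression measures become inaccurate, which can inflate false positives in differential expression analysis.
Choosing a method whose assumptions fit the experiment makes downstream analyses more reliable.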
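To make the findings above concrete, here is a minimal sketch of how a violated assumption distorts normalized values. It is not taken from the paper: the four-gene expression values, the sequencing depth, and the helper names (expected_counts, total_count_normalize, fold_changes) are all hypothetical. It uses simple total-count normalization, which is valid only if total expression is the same in both conditions, and deliberately breaks that assumption by strongly up-regulating one gene.

```python
# Toy illustration (hypothetical numbers, not from the paper):
# total-count normalization when total expression differs between conditions.

def expected_counts(expression, depth):
    """Idealized read counts: each gene gets reads in proportion to its
    share of the total expression in the sample (no sampling noise)."""
    total = sum(expression)
    return [depth * e / total for e in expression]

def total_count_normalize(counts):
    """Divide each count by the sample's total read count."""
    total = sum(counts)
    return [c / total for c in counts]

def fold_changes(reference, other):
    """Per-gene ratio other / reference."""
    return [y / x for x, y in zip(reference, other)]

# True (unobserved) expression per gene. Only the last gene changes,
# so total expression differs between conditions (400 vs 1000).
expr_a = [100, 100, 100, 100]
expr_b = [100, 100, 100, 700]

depth = 4000  # same number of reads sequenced in both samples (assumed)

counts_a = expected_counts(expr_a, depth)
counts_b = expected_counts(expr_b, depth)

true_fc = fold_changes(expr_a, expr_b)
estimated_fc = fold_changes(
    total_count_normalize(counts_a),
    total_count_normalize(counts_b),
)

print("true fold changes:     ", [round(x, 3) for x in true_fc])
print("estimated fold changes:", [round(x, 3) for x in estimated_fc])
```

In this toy setup the three unchanged genes have a true fold change of 1 but appear down-regulated after normalization, and the up-regulated gene's change is underestimated. The reads consumed by the highly expressed gene are reads the other genes do not receive, which is the kind of distortion that can surface as false positives in a differential expression analysis.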
Abstract
RNA-Seq is a widely used method for studying the behavior of genes under different biological conditions. An essential step in an RNA-Seq study is normalization, in which raw data are adjusted to account for factors that prevent direct comparison of expression measures. Errors in normalization can have a significant impact on downstream analysis, such as inflated false positives in differential expression analysis. An under-emphasized feature of normalization is the set of assumptions upon which the methods rely, and how the validity of these assumptions can substantially affect the performance of the methods. In this paper, we explain how assumptions provide the link between raw RNA-Seq read counts and meaningful measures of gene expression. We examine normalization methods from the perspective of their assumptions, as an understanding of methodological assumptions is necessary for…
