A Unified Model for Differential Expression Analysis of RNA-seq Data via L1-Penalized Linear Regression
Kefei Liu, Jieping Ye, Yang Yang, Li Shen, Hui Jiang

TL;DR
This paper introduces a unified statistical model that simultaneously normalizes RNA-seq data and detects differential gene expression, improving accuracy over traditional methods by jointly estimating normalization factors and expression differences.
Contribution
It proposes a joint normalization and differential expression detection model using L1-penalized linear regression, addressing limitations of existing ad hoc normalization approaches.
Findings
Outperforms existing methods in detection power
Reduces false-positive rates in differential expression analysis
Effective when many genes are differentially expressed
Abstract
RNA sequencing (RNA-seq) is becoming increasingly popular for quantifying gene expression levels. Since RNA-seq measurements are relative in nature, between-sample normalization of counts is an essential step in differential expression (DE) analysis. In existing DE detection algorithms, normalization is ad hoc and performed once and for all prior to DE detection, which may be suboptimal since ideally normalization should be based on non-DE genes only and thus coupled with DE detection. We propose a unified statistical model for joint normalization and DE detection of log-transformed RNA-seq data. Sample-specific normalization factors are modeled as unknown parameters in the gene-wise linear models and jointly estimated with the regression coefficients. By imposing a sparsity-inducing L1 penalty (or mixed L1/L2-norm for multiple treatment conditions) on the regression coefficients,…
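The joint estimation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-group design and a gene-wise model of the form y[i, j] = mu[i] + beta[i] * x[j] + d[j] + noise on log-transformed values, where d[j] is the sample-specific normalization factor and beta[i] is the penalized log fold change. The alternating update scheme, the function names, the penalty weight lam, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator, the closed-form one-dimensional lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def joint_normalize_and_detect(y, x, lam, n_iter=200, tol=1e-10):
    """Alternate gene-wise lasso updates with normalization-factor updates.

    y   : (genes, samples) array of log-transformed expression values
    x   : (samples,) 0/1 treatment indicator (two-group design assumed)
    lam : L1 penalty weight on beta (hypothetical tuning parameter)
    Minimizes 0.5 * sum((y - mu - beta*x - d)^2) + lam * sum(|beta|).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()              # centered design, so mu can be profiled out
    sxx = np.sum(xc ** 2)
    d = np.zeros(y.shape[1])       # normalization factors, kept at mean zero
    beta = np.zeros(y.shape[0])
    mu = y.mean(axis=1)
    for _ in range(n_iter):
        # Step 1: with d fixed, each gene is a one-predictor lasso problem.
        r = y - d
        beta_new = soft_threshold(r @ xc, lam) / sxx
        mu = r.mean(axis=1) - beta_new * x.mean()
        # Step 2: with mu and beta fixed, d[j] is the mean residual of sample j.
        d_new = (y - mu[:, None] - np.outer(beta_new, x)).mean(axis=0)
        shift = d_new.mean()       # fix identifiability: mean(d) = 0
        d_new -= shift
        mu += shift
        change = max(np.max(np.abs(beta_new - beta)), np.max(np.abs(d_new - d)))
        beta, d = beta_new, d_new
        if change < tol:
            break
    return mu, beta, d


# Simulated example (all values below are made up for illustration).
rng = np.random.default_rng(0)
n_genes, n_de = 200, 20
x = np.array([0, 0, 0, 1, 1, 1])
d_true = np.array([0.3, -0.2, 0.1, 0.8, 0.5, 0.9])
d_true -= d_true.mean()
beta_true = np.zeros(n_genes)
beta_true[:n_de] = 2.0             # the first 20 genes are up-regulated
mu_true = rng.normal(5.0, 1.0, n_genes)
noise = rng.normal(0.0, 0.1, (n_genes, x.size))
y = mu_true[:, None] + np.outer(beta_true, x) + d_true + noise

mu_hat, beta_hat, d_hat = joint_normalize_and_detect(y, x, lam=0.5)
print("genes flagged as DE:", np.count_nonzero(beta_hat))
print("estimated normalization factors:", np.round(d_hat, 3))
```

In this sketch, genes with a nonzero beta are the ones flagged as DE, and because the penalty shrinks most coefficients to exactly zero, the normalization factors end up being driven mainly by the genes treated as non-DE. The nonzero estimates are shrunk toward zero by the penalty, so they indicate which genes are flagged rather than serving as unbiased fold-change estimates.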
Taxonomy
Topics: Gene expression and cancer classification · Genetic and phenotypic traits in livestock · Statistical Methods and Inference
