Improving data interpretability with new differential sample variance gene set tests
Yasir Rahmatallah, Galina Glazko

TL;DR
This paper introduces new gene set analysis methods to detect differences in sample variance, improving biological interpretations from gene expression data.
Contribution
The paper proposes multivariate gene set analysis methods using minimum spanning tree ranking to detect differential sample variance.
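As a rough illustration of how a minimum spanning tree (MST) ranking can turn a univariate two-sample statistic into a multivariate one, the sketch below pools samples from both phenotypes, builds an MST over them, orders the samples radially from the tree's most central vertex, and evaluates a Cramér-von Mises-type statistic along that ordering with a permutation p-value. This is a minimal sketch under stated assumptions, not the GSAR implementation: the radial ordering, the tie-breaking rule, the exact form of the statistic, and all function names (`radial_order`, `cvm_statistic`, `mst_variance_test`) are hypothetical choices made for illustration, since this summary does not spell out the paper's definitions.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import pdist, squareform


def radial_order(X):
    """Order samples (rows of X) by hop distance from the MST's central vertex.

    Assumption: 'central' means smallest eccentricity in the tree; ties in hop
    distance are broken by Euclidean distance to the root.
    """
    dist = squareform(pdist(X))                  # pairwise Euclidean distances
    mst = minimum_spanning_tree(dist)            # sparse matrix holding tree edges
    hops = shortest_path(mst, directed=False, unweighted=True)  # edge counts
    root = np.argmin(hops.max(axis=1))           # vertex with smallest eccentricity
    # lexsort uses the LAST key as the primary key: hop depth, then distance.
    return np.lexsort((dist[root], hops[root]))


def cvm_statistic(labels_in_order):
    """Two-sample Cramér-von Mises-type statistic along a given ordering."""
    y = np.asarray(labels_in_order, dtype=bool)
    n1, n2 = y.sum(), (~y).sum()
    f1 = np.cumsum(y) / n1                       # empirical CDF of group 1
    f2 = np.cumsum(~y) / n2                      # empirical CDF of group 2
    return n1 * n2 / (n1 + n2) ** 2 * np.sum((f1 - f2) ** 2)


def mst_variance_test(X, labels, n_perm=999, seed=0):
    """Permutation test: the ordering is fixed, only the labels are shuffled."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    order = radial_order(X)
    observed = cvm_statistic(labels[order])
    perms = np.array(
        [cvm_statistic(rng.permutation(labels)[order]) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(perms >= observed)) / (1 + n_perm)
    return observed, p_value


# Synthetic example: equal means, different spread across 10 "genes".
demo_rng = np.random.default_rng(1)
X = np.vstack([demo_rng.normal(0, 1, (20, 10)), demo_rng.normal(0, 3, (20, 10))])
labels = np.array([0] * 20 + [1] * 20)
stat, p = mst_variance_test(X, labels)
print(f"statistic={stat:.3f}, permutation p-value={p:.3f}")
```

The intuition behind a radial ordering is that samples from the lower-variance phenotype tend to sit near the centre of the pooled tree and samples from the higher-variance phenotype near its periphery, so a difference in spread shows up as an imbalance of labels along the ordering. An Anderson-Darling-type variant would weight the squared differences more heavily in the tails of the ordering.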
Findings
Methods detecting differential sample variance identify meaningful biological pathways in heterogeneous datasets.
The new methods were successfully applied to leukemia and polyp gene expression datasets with distinct subtypes.
Software implementation is available in the Bioconductor package GSAR for use with normalized omics data.
Abstract
Gene set analysis methods have played a major role in generating biological interpretations from omics data such as gene expression datasets. However, most methods focus on detecting homogeneous pattern changes in mean expression, and methods detecting pattern changes in variance remain poorly explored. While a few studies have attempted to use gene-level variance analysis, such an approach remains under-utilized. When comparing two phenotypes, gene sets with distinct changes in subgroups under one phenotype are overlooked by available methods, although they reflect meaningful biological differences between the two phenotypes. Multivariate sample-level variance analysis methods are needed to detect such pattern changes. We use ranking schemes based on the minimum spanning tree to generalize the Cramér-von Mises and Anderson-Darling univariate statistics into multivariate gene set analysis methods to…
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Gene Regulatory Network Analysis
