Enrichment on steps, not genes, improves inference of differentially expressed pathways
Nicholas Markarian, Kimberly M. Van Auken, Dustin Ebert, Paul W. Sternberg

TL;DR
This paper performs pathway enrichment on pathway steps rather than on individual genes, which recovers differentially expressed pathways that gene-level enrichment misses.
Contribution
The novel approach treats sets of interchangeable genes as single entities to improve pathway enrichment analysis.
Findings
Treating gene sets as single entities increases sensitivity to pathways with OR logic.
The method recovers pathways missed by traditional gene list enrichment analysis.
In medically relevant datasets, a significant proportion of the enriched pathways are new, that is, not found by gene list enrichment.
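To make the size-inflation effect concrete, here is a minimal sketch that compares gene-level and step-level enrichment on a toy pathway. It is an illustration under stated assumptions, not the authors' implementation: the pathway, the gene names, the background sizes, and the rule that a step counts as hit when any gene in its set is differentially expressed are all hypothetical, and a plain hypergeometric tail stands in for whatever test the paper uses.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


# Hypothetical pathway with three steps; step 2 has OR logic over eight genes.
steps = [
    {"A"},
    {"B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"},
    {"C"},
]
# Hypothetical differentially expressed (DE) genes that fall in the pathway.
de_in_pathway = {"A", "B3", "C"}

# Gene-level view: the pathway is as large as its total gene count.
pathway_genes = set().union(*steps)
gene_hits = len(pathway_genes & de_in_pathway)
# Assumed background: 1000 genes, 50 of them DE.
gene_p = hypergeom_tail(gene_hits, population=1000, successes=len(pathway_genes), draws=50)

# Step-level view: each set collapses to one entity, hit if any member is DE (assumption).
step_hits = sum(1 for genes in steps if genes & de_in_pathway)
# Assumed background after collapsing sets: 600 entities, 40 of them DE.
step_p = hypergeom_tail(step_hits, population=600, successes=len(steps), draws=40)

print(f"gene-level: {gene_hits}/{len(pathway_genes)} hit, p = {gene_p:.3g}")
print(f"step-level: {step_hits}/{len(steps)} hit, p = {step_p:.3g}")
```

In this toy case every step is hit, yet the gene-level view sees only 3 of 10 genes, so the large OR set dilutes the signal. Counting steps removes that dilution.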
Abstract
Enrichment analysis is frequently used in combination with differential expression data to investigate potential commonalities amongst lists of genes and generate hypotheses for further experiments. However, current enrichment analysis approaches on pathways ignore the functional relationships between genes in a pathway, particularly OR logic that occurs when a set of proteins can each individually perform the same step in a pathway. As a result, these approaches miss pathways with large or multiple sets because of an inflation of pathway size (when measured as the total gene count) relative to the number of steps. We address this problem by enriching on step-enabling entities in pathways. We treat sets of protein-coding genes as single entities, and we also weight sets to account for the number of genes in them using the multivariate Fisher’s noncentral hypergeometric distribution. We…
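The multivariate Fisher's noncentral hypergeometric distribution named in the abstract can be sketched for small cases by brute-force enumeration. With class sizes m_i, weights w_i, and an outcome x whose counts sum to n, the probability is proportional to the product of C(m_i, x_i) * w_i^x_i over all classes. The function name and the example numbers below are hypothetical, and how the paper maps gene sets to classes and weights is not visible in the truncated abstract, so this shows only the distribution itself.

```python
from itertools import product
from math import comb, prod


def mfnchypergeom_pmf(x, m, w):
    """PMF of the multivariate Fisher's noncentral hypergeometric distribution.

    x: counts drawn from each class, m: class sizes, w: class weights (odds).
    Enumerates the whole support, so it is only suitable for small examples.
    """
    n = sum(x)

    def term(y):
        # Unnormalized probability of outcome y.
        return prod(comb(mi, yi) * wi ** yi for mi, yi, wi in zip(m, y, w))

    # All outcomes that draw exactly n items in total.
    support = (
        y for y in product(*(range(mi + 1) for mi in m)) if sum(y) == n
    )
    return term(x) / sum(term(y) for y in support)


# Hypothetical example: three classes of sizes 3, 2 and 4; three items drawn.
sizes = (3, 2, 4)
central = mfnchypergeom_pmf((0, 0, 3), sizes, (1, 1, 1))
weighted = mfnchypergeom_pmf((0, 0, 3), sizes, (1, 1, 3))
print(central, weighted)
```

With all weights equal to 1 the distribution reduces to the ordinary multivariate hypergeometric distribution. Raising the weight of one class shifts probability toward outcomes that draw more items from that class.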
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Biomedical Text Mining and Ontologies
