Gene set proximity analysis: expanding gene set enrichment analysis through learned geometric embeddings
Henry Cousins, Taryn Hall, Yinglong Guo, Luke Tso, Kathy Tzy-Hwa Tzeng, Le Cong, Russ Altman

TL;DR
Gene set proximity analysis (GSPA) extends traditional gene set enrichment analysis by embedding gene interactions into a learned geometric space, enhancing pathway detection and reproducibility, and identifying novel drug associations for COVID-19.
Contribution
The paper introduces GSPA, a novel method that captures complex gene interactions in a latent space, improving pathway analysis and drug discovery over existing approaches.
Findings
GSPA outperforms traditional methods in identifying disease pathways.
GSPA reveals novel drug associations with SARS-CoV-2.
Retrospective analysis supports gabapentin as a risk factor and metformin as a protective factor for COVID-19.
Abstract
Gene set analysis methods rely on knowledge-based representations of genetic interactions in the form of both gene set collections and protein-protein interaction (PPI) networks. Explicit representations of genetic interactions often fail to capture complex interdependencies among genes, limiting the analytic power of such methods. Here we propose an extension of gene set enrichment analysis to a latent feature space reflecting PPI network topology, called gene set proximity analysis (GSPA). Compared with existing methods, GSPA provides improved ability to identify disease-associated pathways in disease-matched gene expression datasets, while improving reproducibility of enrichment statistics for similar gene sets. GSPA is statistically straightforward, reducing to classical gene set enrichment through a single user-defined parameter. We apply our method to identify novel drug…
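The idea in the abstract, scoring a gene set together with its neighbours in an embedding space, with a single parameter that recovers classical enrichment, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of Euclidean distance, and the hard radius threshold are all assumptions made for illustration; the only property taken from the abstract is that one user-defined parameter (here `radius`) reduces the statistic to classical gene set enrichment.

```python
import numpy as np

def expand_gene_set(embeddings, gene_set, radius):
    """Indices of genes within `radius` of any set member (hypothetical helper).

    embeddings: (n_genes, dim) array of learned gene coordinates.
    gene_set:   iterable of integer gene indices.
    """
    members = np.asarray(sorted(gene_set))
    # Distance from every gene to every member of the set.
    diffs = embeddings[:, None, :] - embeddings[members][None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return np.flatnonzero(dists.min(axis=1) <= radius)

def enrichment_score(scores, in_set, p=1.0):
    """GSEA-style weighted running-sum statistic for a boolean membership mask.

    Assumes the set is neither empty nor the whole gene list.
    """
    order = np.argsort(-scores)            # rank genes by decreasing score
    hits = in_set[order]
    weights = np.abs(scores[order]) ** p
    hit_w = np.where(hits, weights, 0.0)
    p_hit = np.cumsum(hit_w) / hit_w.sum()
    p_miss = np.cumsum(~hits) / (~hits).sum()
    running = p_hit - p_miss
    return running[np.argmax(np.abs(running))]  # signed maximum deviation

def proximity_enrichment(scores, embeddings, gene_set, radius):
    """Enrichment of the radius-expanded set; radius=0 keeps the original set."""
    mask = np.zeros(len(scores), dtype=bool)
    mask[expand_gene_set(embeddings, gene_set, radius)] = True
    return enrichment_score(scores, mask)
```

With `radius=0` each set member is only within range of itself (assuming no two genes share identical coordinates), so the result matches the classical statistic on the unexpanded set. Larger radii pull in nearby genes, which is one simple way to let network topology inform the score.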
Taxonomy
Topics: Bioinformatics and Genomic Networks · Computational Drug Discovery Methods · Gene Expression and Cancer Classification
