A Graphical Method for Identifying Gene Clusters from RNA Sequencing Data
Jake R. Patock, Rinki Ratnapriya, Arko Barman

TL;DR
This paper introduces a graph-based, jointly optimized method for identifying gene clusters from RNA-Seq data. The method is designed to be robust and applicable to studying the mechanisms of diseases such as age-related macular degeneration (AMD).
Contribution
It presents a novel integrated approach combining gene co-expression networks, Node2Vec+ embeddings, spectral clustering, and hyperparameter optimization for stable gene cluster identification.
Findings
The method produces consistent, robust gene clusters.
It is applicable to a variety of RNA-Seq datasets and diseases.
Validation on an AMD dataset showed significant results.
Abstract
The identification of disease-gene associations is instrumental in understanding the mechanisms of diseases and developing novel treatments. Besides identifying genes from RNA-Seq datasets, it is often necessary to identify gene clusters that have relationships with a disease. In this work, we propose a graph-based method that takes an RNA-Seq dataset with known genes related to a disease and performs a robust clustering analysis to identify clusters of genes. Our method involves the construction of a gene co-expression network, followed by the computation of gene embeddings using Node2Vec+, an algorithm that applies weighted biased random walks and skip-gram with negative sampling to compute node embeddings from undirected graphs with weighted edges. Finally, we perform spectral clustering to identify clusters of genes. All steps of our method are jointly optimized for…
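As a rough illustration of the first two stages described in the abstract, the sketch below builds a small co-expression graph and draws weighted random walks over it. It is a minimal sketch under stated assumptions, not the authors' implementation: the edge rule (absolute Pearson correlation above a fixed threshold), the toy expression values, and the function names are all hypothetical. The walk is a plain weight-proportional walk; it omits the second-order bias that Node2Vec+ applies, and it stops before the skip-gram training and spectral clustering steps.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def coexpression_graph(expr, threshold=0.8):
    """Undirected weighted graph; edge weight is |r| when it reaches the threshold.

    The threshold rule is an assumption made for this sketch.
    """
    genes = list(expr)
    graph = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            w = abs(pearson(expr[g], expr[h]))
            if w >= threshold:
                graph[g][h] = w
                graph[h][g] = w
    return graph


def weighted_walk(graph, start, length, rng):
    """First-order random walk: the next node is drawn in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        nbrs = graph[walk[-1]]
        if not nbrs:  # isolated node: the walk cannot continue
            break
        nodes = list(nbrs)
        weights = [nbrs[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Toy expression matrix (hypothetical values): gene -> expression across 5 samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [9.8, 2.1, 8.1, 3.9, 6.2],
}

graph = coexpression_graph(expr, threshold=0.8)
rng = random.Random(0)
walks = [weighted_walk(graph, g, 5, rng) for g in graph]
```

In a fuller pipeline, walks like these would serve as the "sentences" fed to a skip-gram model, and the resulting gene embeddings would then be clustered.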
Taxonomy
Topics: Bioinformatics and Genomic Networks · Single-cell and spatial transcriptomics · Gene expression and cancer classification
