GRACKLE: an interpretable matrix factorization approach for biomedical representation learning
Lucas A Gillenwater, Lawrence E Hunter, James C Costello

TL;DR
GRACKLE is a nonnegative matrix factorization method that uses prior biological knowledge to improve the discovery of disease-related gene expression patterns, especially when sample sizes are small.
Contribution
GRACKLE jointly incorporates sample similarity and gene similarity, together with prior biological knowledge, into a single matrix factorization.
Findings
GRACKLE outperformed other NMF algorithms in simulations with increased background noise.
GRACKLE identified condition-enriched subgroups in breast tumors and Down syndrome samples.
The learned latent representations aligned with known biological patterns, such as autoimmune conditions and sleep apnea.
Abstract
Disruption in normal gene expression can contribute to the development of diseases and chronic conditions. However, identifying disease-specific gene signatures can be challenging due to the presence of multiple co-occurring conditions and limited sample sizes. Unsupervised representation learning methods, such as matrix decomposition and deep learning, simplify high-dimensional data into understandable patterns, but often do not provide clear biological explanations. Incorporating prior biological knowledge directly can enhance understanding and address small sample sizes. Nevertheless, current models do not jointly consider prior knowledge of molecular interactions and sample labels. We present GRACKLE, a novel nonnegative matrix factorization approach that applies Graph Regularization Across Contextual KnowLedgE. GRACKLE integrates sample similarity and gene similarity matrices…
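To make the idea in the abstract concrete, the following is a minimal sketch of a graph-regularized NMF that uses one similarity graph over samples and one over genes. It is not the authors' implementation. The objective, the multiplicative updates (written in the style of standard graph-regularized NMF), the samples-by-genes orientation of `X`, the function names, the weights `lam_s` and `lam_g`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def graph_regularized_nmf(X, A_s, A_g, k, lam_s=0.1, lam_g=0.1,
                          n_iter=200, seed=0, eps=1e-10):
    """Illustrative sketch (assumed formulation, not the published algorithm).

    Approximately minimizes
        ||X - W H||_F^2 + lam_s * Tr(W^T L_s W) + lam_g * Tr(H L_g H^T)
    with W, H >= 0, where L = D - A is the graph Laplacian of a symmetric,
    nonnegative similarity matrix A.
    X: samples x genes, A_s: samples x samples, A_g: genes x genes.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k))          # sample-by-component loadings
    H = rng.random((k, m))          # component-by-gene loadings
    D_s = np.diag(A_s.sum(axis=1))  # degree matrix of the sample graph
    D_g = np.diag(A_g.sum(axis=1))  # degree matrix of the gene graph
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        W *= (X @ H.T + lam_s * A_s @ W) / (W @ H @ H.T + lam_s * D_s @ W + eps)
        H *= (W.T @ X + lam_g * H @ A_g) / (W.T @ W @ H + lam_g * H @ D_g + eps)
    return W, H

def objective(X, W, H, A_s, A_g, lam_s=0.1, lam_g=0.1):
    """Reconstruction error plus the two Laplacian smoothness penalties."""
    L_s = np.diag(A_s.sum(axis=1)) - A_s
    L_g = np.diag(A_g.sum(axis=1)) - A_g
    recon = np.linalg.norm(X - W @ H) ** 2
    return recon + lam_s * np.trace(W.T @ L_s @ W) + lam_g * np.trace(H @ L_g @ H.T)

# Toy data (hypothetical): 6 samples in two groups, 8 genes in two modules.
rng = np.random.default_rng(1)
X = np.kron(np.eye(2), np.ones((3, 4))) + 0.1 * rng.random((6, 8))
A_s = np.kron(np.eye(2), np.ones((3, 3))) - np.eye(6)  # samples linked within a group
A_g = np.kron(np.eye(2), np.ones((4, 4))) - np.eye(8)  # genes linked within a module

W0, H0 = graph_regularized_nmf(X, A_s, A_g, k=2, n_iter=0)  # random initialization only
W, H = graph_regularized_nmf(X, A_s, A_g, k=2, n_iter=200)
loss_init = objective(X, W0, H0, A_s, A_g)
loss_final = objective(X, W, H, A_s, A_g)
print(f"objective: {loss_init:.3f} -> {loss_final:.3f}")
```

In this sketch, the two penalty terms pull connected samples toward similar rows of `W` and connected genes toward similar columns of `H`, which is one common way to inject prior similarity information into a factorization. How GRACKLE actually constructs and weights its similarity matrices is described in the paper itself.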
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Single-cell and spatial transcriptomics
