Conditional Hierarchical Bayesian Tucker Decomposition for Genetic Data Analysis
Adam Sandler, Diego Klabjan, Yuan Luo

TL;DR
This paper introduces a novel hierarchical Bayesian Tucker decomposition method tailored for analyzing large, sparse genetic datasets, enabling the discovery of meaningful gene and pathway groupings associated with various cancers and autism.
Contribution
The paper extends latent Dirichlet allocation to multiple dimensions and develops distinct methods for hierarchical topic modeling, producing models that are more coherent than baseline models on genetic data.
Findings
The conditional hierarchical Bayesian Tucker decomposition models are more coherent than baseline models
Successfully identified gene and pathway groups linked to cancer and autism
Enhanced understanding of genetic risk factors
Abstract
We analyze large, multi-dimensional, sparse counting data sets, finding unsupervised groups to provide unique insights into genetic data. We create gene and biological pathway groups based on patients' variants to find common risk factors for four common types of cancer (breast, lung, prostate, and colorectal) and autism spectrum disorder. To accomplish this, we extend latent Dirichlet allocation to multiple dimensions and design distinct methods for hierarchical topic modeling. We find that our conditional hierarchical Bayesian Tucker decomposition models are more coherent than baseline models.
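To make "latent Dirichlet allocation in multiple dimensions" concrete, here is a minimal sketch of a Tucker-style generative process for a patient × gene × pathway count tensor. It illustrates the general idea only and is not the authors' model: the function name `sample_count_tensor`, the tensor sizes, the Dirichlet concentrations, and the fixed number of variants per patient are all assumptions made for this example, and the conditional and hierarchical parts of the paper's method are not represented.

```python
import numpy as np

def sample_count_tensor(n_patients=5, n_genes=8, n_pathways=6,
                        n_gene_groups=3, n_pathway_groups=2,
                        variants_per_patient=20, seed=0):
    """Sample a toy (patient, gene, pathway) count tensor.

    Hypothetical illustration: all sizes and priors are assumed values.
    """
    rng = np.random.default_rng(seed)
    # Each gene group is a probability distribution over genes.
    gene_topics = rng.dirichlet(np.full(n_genes, 1.0), size=n_gene_groups)
    # Each pathway group is a probability distribution over pathways.
    pathway_topics = rng.dirichlet(np.full(n_pathways, 1.0), size=n_pathway_groups)

    counts = np.zeros((n_patients, n_genes, n_pathways), dtype=np.int64)
    for p in range(n_patients):
        # Patient-specific core: joint distribution over
        # (gene group, pathway group) pairs.
        core = rng.dirichlet(np.full(n_gene_groups * n_pathway_groups, 1.0))
        core = core.reshape(n_gene_groups, n_pathway_groups)
        # Tucker-style mixing:
        # P(gene g, pathway w | patient) =
        #   sum_{a,b} core[a, b] * gene_topics[a, g] * pathway_topics[b, w]
        probs = np.einsum("ab,ag,bw->gw", core, gene_topics, pathway_topics)
        probs = probs / probs.sum()  # guard against floating-point drift
        draws = rng.multinomial(variants_per_patient, probs.ravel())
        counts[p] = draws.reshape(n_genes, n_pathways)
    return counts, gene_topics, pathway_topics

counts, gene_topics, pathway_topics = sample_count_tensor()
print(counts.shape)  # (patients, genes, pathways)
```

In ordinary latent Dirichlet allocation each document mixes topics along a single mode (words). Here the per-patient core mixes pairs of groups, one factor per mode, which is what gives the decomposition its Tucker structure. Inference would run in the opposite direction, recovering the groups and cores from observed counts.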
Taxonomy
Topics: Machine Learning in Healthcare · Epigenetics and DNA Methylation
Methods: Logistic Regression
