Conditional Hierarchical Bayesian Tucker Decomposition for Genetic Data Analysis

Adam Sandler; Diego Klabjan; Yuan Luo

arXiv:1911.12426·cs.LG·September 1, 2025

TL;DR

This paper introduces a conditional hierarchical Bayesian Tucker decomposition for analyzing large, sparse, multi-dimensional genetic count data. The method finds unsupervised gene and pathway groupings associated with four common cancers (breast, lung, prostate, and colorectal) and autism spectrum disorder.

Contribution

The paper extends latent Dirichlet allocation to multiple dimensions and designs distinct methods for hierarchical topic modeling; the resulting models are more coherent than baseline models on genetic data.

Findings

1. Models are more coherent than baseline approaches
2. Successfully identified gene and pathway groups linked to cancer and autism
3. Enhanced understanding of genetic risk factors

Abstract

We analyze large, multi-dimensional, sparse counting data sets, finding unsupervised groups to provide unique insights into genetic data. We create gene and biological pathway groups based on patients' variants to find common risk factors for four common types of cancer (breast, lung, prostate, and colorectal) and autism spectrum disorder. To accomplish this, we extend latent Dirichlet allocation to multiple dimensions and design distinct methods for hierarchical topic modeling. We find that our conditional hierarchical Bayesian Tucker decomposition models are more coherent than baseline models.
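
To make the idea of extending latent Dirichlet allocation to more than one dimension concrete, here is a minimal illustrative sketch of a Tucker-style generative model for count data. It is not the authors' implementation. It assumes a patient × gene × pathway count tensor, flat symmetric Dirichlet priors, and arbitrary small sizes; the function name `sample_tucker_counts` and all parameters are hypothetical, and the hierarchical and conditional parts of the paper's models are omitted entirely.

```python
import numpy as np

def sample_tucker_counts(n_patients=5, n_genes=6, n_pathways=4,
                         k_gene=2, k_path=3, n_variants=20, seed=0):
    """Sample a sparse count tensor from a toy Tucker-style topic model.

    All sizes and the symmetric Dirichlet(1) priors are illustrative
    assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Per-patient core: a distribution over (gene topic, pathway topic) pairs.
    core = rng.dirichlet(np.ones(k_gene * k_path), size=n_patients)
    core = core.reshape(n_patients, k_gene, k_path)

    # One distribution over genes per gene topic, and likewise for pathways.
    gene_topics = rng.dirichlet(np.ones(n_genes), size=k_gene)
    path_topics = rng.dirichlet(np.ones(n_pathways), size=k_path)

    # Tucker product:
    # probs[p, g, w] = sum_{a, b} core[p, a, b] * gene_topics[a, g] * path_topics[b, w]
    probs = np.einsum("pab,ag,bw->pgw", core, gene_topics, path_topics)

    # Each patient's variants are drawn from that patient's (gene, pathway) distribution.
    counts = np.stack([
        rng.multinomial(n_variants, probs[p].ravel()).reshape(n_genes, n_pathways)
        for p in range(n_patients)
    ])
    return probs, counts

probs, counts = sample_tucker_counts()
```

In this sketch the core tensor plays the role that the document-topic distribution plays in ordinary latent Dirichlet allocation, except that each patient has a joint distribution over pairs of topics, one topic per mode. Because every factor is a probability distribution, each patient's slice of `probs` sums to one.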

Code & Models

Repositories

ars2240/asdHBTucker (Official)

Taxonomy

Topics: Machine Learning in Healthcare · Epigenetics and DNA Methylation

Methods: Logistic Regression