A Bayesian hierarchical hidden Markov model for clustering and gene selection: Application to kidney cancer gene expression data

Thierry Chekouo; Himadri Mukherjee

PMC · DOI:10.1002/bimj.202300173·July 12, 2024

A Bayesian hierarchical hidden Markov model for clustering and gene selection: Application to kidney cancer gene expression data

Thierry Chekouo, Himadri Mukherjee

TL;DR

This paper introduces a Bayesian model using hidden Markov structures to cluster genes and identify relevant features in kidney cancer data.

Contribution

The novelty lies in integrating gene ontology knowledge with HMMs for biclustering and gene selection.

Findings

01

The model identifies clusters of samples with overexpressed, underexpressed, and irrelevant genes.

02

The method was validated using simulated data and The Cancer Genome Atlas kidney cancer dataset.

03

An R package was developed to implement the proposed Bayesian approach.

Abstract

We introduce a Bayesian approach for biclustering that accounts for the prior functional dependence between genes using hidden Markov models (HMMs). We utilize biological knowledge gathered from gene ontologies and the hidden Markov structure to capture the potential coexpression of neighboring genes. Our interpretable model-based clustering characterized each cluster of samples by three groups of features: overexpressed, underexpressed, and irrelevant features. The proposed methods have been implemented in an R package and are used to analyze both the simulated data and The Cancer Genome Atlas kidney cancer data.

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Diseases2

kidney cancer Cancer

Figures7

Click any figure to enlarge with its caption.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGene expression and cancer classification · Bioinformatics and Genomic Networks · Bayesian Modeling and Causal Inference