# A Bayesian hierarchical hidden Markov model for clustering and gene selection: Application to kidney cancer gene expression data

**Authors:** Thierry Chekouo, Himadri Mukherjee

PMC · DOI: 10.1002/bimj.202300173 · 2024-07-12

## TL;DR

This paper introduces a Bayesian biclustering model that uses a hidden Markov structure over genes to cluster samples and select the relevant genes in kidney cancer expression data.

## Contribution

The novelty lies in using gene ontology knowledge, encoded through a hidden Markov structure that captures coexpression of neighboring genes, to inform both biclustering and gene selection.

## Key findings

- The model characterizes each cluster of samples by three groups of genes: overexpressed, underexpressed, and irrelevant.
- The method was applied to simulated data and to The Cancer Genome Atlas (TCGA) kidney cancer data.
- An R package was developed to implement the proposed Bayesian approach.

## Abstract

We introduce a Bayesian approach for biclustering that accounts for the prior functional dependence between genes using hidden Markov models (HMMs). We utilize biological knowledge gathered from gene ontologies and the hidden Markov structure to capture the potential coexpression of neighboring genes. Our interpretable model-based clustering characterizes each cluster of samples by three groups of features: overexpressed, underexpressed, and irrelevant features. The proposed methods have been implemented in an R package and are used to analyze both simulated data and The Cancer Genome Atlas kidney cancer data.
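To make the three-group idea concrete, the following is a minimal generative sketch, not the authors' model or their R package. It assumes the genes are already arranged in some order (for example, one derived from gene ontology similarity), so that "neighboring" is meaningful. The transition probabilities, the mean shift, and all function names are hypothetical values chosen for illustration only.

```python
import random

STATES = ("under", "irrelevant", "over")

# Hypothetical transition probabilities (an assumption, not from the paper).
# Large self-transition values encode that neighboring genes tend to share a state.
TRANSITION = {
    "under":      {"under": 0.80, "irrelevant": 0.15, "over": 0.05},
    "irrelevant": {"under": 0.05, "irrelevant": 0.90, "over": 0.05},
    "over":       {"under": 0.05, "irrelevant": 0.15, "over": 0.80},
}


def sample_gene_states(n_genes, rng, start="irrelevant"):
    """Draw one latent state per gene from a first-order Markov chain."""
    states = [start]
    for _ in range(n_genes - 1):
        probs = TRANSITION[states[-1]]
        nxt = rng.choices(STATES, weights=[probs[s] for s in STATES])[0]
        states.append(nxt)
    return states


def simulate_cluster_expression(states, n_samples, rng, shift=2.0, sd=1.0):
    """Simulate expression for one cluster of samples.

    Overexpressed genes get mean +shift, underexpressed genes get mean -shift,
    and irrelevant genes get mean 0 (hypothetical emission model).
    """
    mean = {"under": -shift, "irrelevant": 0.0, "over": shift}
    return [[rng.gauss(mean[s], sd) for s in states] for _ in range(n_samples)]


rng = random.Random(0)
gene_states = sample_gene_states(50, rng)
expression = simulate_cluster_expression(gene_states, 10, rng)
```

In this sketch `gene_states` plays the role of the hidden sequence and `expression` is a samples-by-genes matrix for a single cluster. Inference in the paper runs in the opposite direction, recovering the hidden states and the sample clusters from observed expression data.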

## Linked entities

- **Diseases:** kidney cancer (MONDO:0002367)

## Full-text entities

- **Diseases:** Cancer (MESH:D009369), kidney cancer (MESH:D007680)

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11239327/full.md

---
Source: https://tomesphere.com/paper/PMC11239327