# Biclustering analysis on tree-shaped time-series single cell gene expression data of Caenorhabditis elegans

**Authors:** Qi Guan, Xianzhong Yan, Yida Wu, Da Zhou, Jie Hu

PMC · DOI: 10.1186/s12859-024-05800-y · BMC Bioinformatics · 2024-05-09

## TL;DR

This paper introduces a biclustering method for tree-shaped time-series single-cell gene expression data from C. elegans embryos. Gene enrichment analysis indicates that the resulting biclusters are biologically relevant.

## Contribution

A novel biclustering model for tree-shaped time-series single-cell gene expression data in C. elegans is proposed.

## Key findings

- On a small-scale dataset, the model outperforms existing classical biclustering methods.
- Gene enrichment analysis confirms the biological relevance of the biclustering results.

## Abstract

In recent years, gene clustering analysis has become a widely used tool for studying gene function: it efficiently groups genes with similar expression patterns, which helps identify what those genes do. Caenorhabditis elegans is commonly used in embryonic research because its cell lineage is consistent from fertilized egg to adulthood. Biologists use 4D confocal imaging to observe gene expression dynamics at the single-cell level. The resulting tree-shaped time-series datasets pose two challenges. First, data points are non-pairwise between different individuals. Second, cell type heterogeneity should be taken into account during clustering in order to obtain more biologically significant results.

A biclustering model is proposed for tree-shaped single-cell gene expression data of Caenorhabditis elegans. Specifically, a tree-shaped piecewise polynomial function is first used to fit the non-pairwise gene expression time-series data. The objective function then combines four factors: Pearson correlation coefficients, which capture gene correlations; p-values from the Kolmogorov-Smirnov test, which measure the similarity between cells; gene expression size; and bicluster overlapping size. Finally, a genetic algorithm is used to optimize the objective function.
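As a rough illustration of the two statistical terms in the objective function described above, the following sketch computes a mean pairwise Pearson correlation across genes and a mean pairwise Kolmogorov-Smirnov p-value across cells for one candidate bicluster. It is not the authors' implementation. The data layout (`expr[gene][cell]` holding fitted expression values), the function names, and the way the values are pooled are all assumptions made for this example. The size and overlap terms, the weighting between terms, and the genetic algorithm are omitted. The p-value uses a standard asymptotic approximation; in practice `scipy.stats.pearsonr` and `scipy.stats.ks_2samp` would typically be used instead.

```python
import math
from bisect import bisect_right
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ks_pvalue(a, b):
    """Two-sample Kolmogorov-Smirnov test, asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    # KS statistic: largest gap between the two empirical CDFs.
    d = max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )
    n_eff = len(a) * len(b) / (len(a) + len(b))
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return 1.0  # the Kolmogorov tail probability is ~1 in this range
    p = 2 * sum(
        (-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return min(1.0, max(0.0, p))


def bicluster_terms(expr, genes, cells):
    """Return (gene_corr, cell_sim) for a candidate bicluster.

    Assumed layout: expr[gene][cell] is a list of fitted expression
    values on a shared time grid. Needs >= 2 genes and >= 2 cells.
    """
    # Gene term: correlate each gene pair over the selected cells.
    profiles = {g: [v for c in cells for v in expr[g][c]] for g in genes}
    gene_corr = mean(
        pearson(profiles[g], profiles[h]) for g, h in combinations(genes, 2)
    )
    # Cell term: compare each cell pair over the selected genes.
    pooled = {c: [v for g in genes for v in expr[g][c]] for c in cells}
    cell_sim = mean(
        ks_pvalue(pooled[c], pooled[d]) for c, d in combinations(cells, 2)
    )
    return gene_corr, cell_sim


# Toy data (hypothetical): gene g2 is exactly twice gene g1.
expr = {
    "g1": {"c1": [1, 2, 3, 4], "c2": [2, 3, 4, 5]},
    "g2": {"c1": [2, 4, 6, 8], "c2": [4, 6, 8, 10]},
}
gene_corr, cell_sim = bicluster_terms(expr, ["g1", "g2"], ["c1", "c2"])
```

A search procedure such as a genetic algorithm would call a function like `bicluster_terms` on many candidate gene and cell subsets and keep those with the best combined score.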

Analysis of the small-scale dataset validates the feasibility and effectiveness of the model, whose results are superior to those of existing classical biclustering models. In addition, gene enrichment analysis is used to assess the results on the complete real dataset, confirming that the discovered biclusters hold significant biological relevance.
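The abstract does not say which enrichment procedure was used. One common way to test whether a gene set is enriched for an annotation is a one-sided hypergeometric test, sketched below under that assumption. The function name and parameters are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_background, n_annotated, n_selected, n_hits):
    """One-sided hypergeometric p-value: P(X >= n_hits).

    n_background: number of genes in the background set
    n_annotated:  background genes carrying the annotation
    n_selected:   genes in the bicluster being tested
    n_hits:       bicluster genes carrying the annotation
    """
    upper = min(n_annotated, n_selected)
    # Count the ways to draw at least n_hits annotated genes.
    favourable = sum(
        comb(n_annotated, i)
        * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_hits, upper + 1)
    )
    return favourable / comb(n_background, n_selected)


# Hypothetical numbers: 5 of 5 selected genes are annotated,
# out of 5 annotated genes in a background of 20.
p = enrichment_pvalue(20, 5, 5, 5)
```

A small p-value suggests that the annotation occurs in the bicluster more often than chance would predict. When many annotations are tested, a multiple-testing correction is normally applied.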

The online version contains supplementary material available at 10.1186/s12859-024-05800-y.

## Linked entities

- **Species:** Caenorhabditis elegans (taxon 6239)

## Full-text entities

- **Species:** Caenorhabditis elegans (species) [taxon 6239]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11080145/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11080145/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC11080145/full.md

---
Source: https://tomesphere.com/paper/PMC11080145