Systematic clustering algorithm for chromatin accessibility data and its application to hematopoietic cells

Azusa Tanaka; Yasuhiro Ishitsuka; Hiroki Ohta; Akihiro Fujimoto; Jun-ichirou Yasunaga; Masao Matsuoka

arXiv:1912.10641·q-bio.GN·October 14, 2024·PLoS Comput. Biol.

TL;DR

This paper introduces a systematic clustering algorithm tailored for chromatin accessibility data, utilizing a novel data reduction method based on genome string representations and Hamming distances, to classify hematopoietic cell types and explore leukemia.

Contribution

The paper presents a new clustering algorithm that employs a systematic peak selection and genome string representation for analyzing chromatin accessibility data.

Findings

01

Effective classification of hematopoietic cell types

02

Quantitative evaluation of sample differences

03

Potential insights into leukemia pathogenesis

Abstract

The huge amount of data acquired by high-throughput sequencing requires data reduction for effective analysis. Here we give a clustering algorithm for genome-wide open chromatin data using a new data reduction method. This method regards the genome as a string of $1$s and $0$s based on a set of peaks and calculates the Hamming distances between the strings. This algorithm with the systematically optimized set of peaks enables us to quantitatively evaluate differences between samples of hematopoietic cells and classify cell types, potentially leading to a better understanding of leukemia pathogenesis.
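The core idea in the abstract, encoding each sample as a binary string over a set of peaks and comparing samples by Hamming distance, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration rather than the authors' implementation: the function names, the toy coordinates, and the rule that a position is set to 1 when the sample has any peak overlapping the corresponding reference peak. The paper's systematic peak selection step is not reproduced.

```python
from itertools import combinations

# Peaks are (chromosome, start, end) tuples with half-open coordinates.
# All names and data below are hypothetical.

def overlaps(a, b):
    """Return True if two peaks on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def to_binary_string(reference_peaks, sample_peaks):
    """Encode a sample as a list of 0/1 values, one per reference peak.

    Assumed rule: 1 if any peak of the sample overlaps the reference peak.
    """
    return [
        1 if any(overlaps(ref, peak) for peak in sample_peaks) else 0
        for ref in reference_peaks
    ]

def hamming(u, v):
    """Count the positions at which two equal-length binary strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(u, v))

def pairwise_hamming(strings):
    """Hamming distance for every unordered pair of named samples."""
    return {
        (a, b): hamming(strings[a], strings[b])
        for a, b in combinations(sorted(strings), 2)
    }

# Toy reference peak set and three toy samples.
reference_peaks = [
    ("chr1", 100, 200),
    ("chr1", 500, 600),
    ("chr2", 50, 150),
    ("chr2", 900, 1000),
]
samples = {
    "sample_A": [("chr1", 120, 180), ("chr2", 60, 100)],
    "sample_B": [("chr1", 150, 250), ("chr2", 950, 990)],
    "sample_C": [("chr1", 510, 590), ("chr2", 940, 1100)],
}

strings = {name: to_binary_string(reference_peaks, peaks)
           for name, peaks in samples.items()}
distances = pairwise_hamming(strings)

for pair, d in distances.items():
    print(pair, d)
```

The resulting distance table could be passed to any standard clustering routine, for example hierarchical clustering. Which clustering method and which peak set the paper actually uses is not stated in the abstract.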
