Segmentation and genome annotation algorithms

Maxwell W Libbrecht; Rachel CW Chan; Michael M Hoffman

arXiv:2101.00688 · q-bio.GN · June 14, 2022 · 1 citation

Open Access

TL;DR

Segmentation and genome annotation (SAGA) algorithms analyze epigenomic data to segment the genome and identify functional elements like promoters and enhancers, aiding understanding of gene regulation.

Contribution

This paper reviews the common framework, variants, and improvements of SAGA algorithms, and catalogs existing large-scale annotations and future directions.

Findings

01

SAGA algorithms segment the genome based on epigenomic patterns.

02

They identify functional categories such as promoters and enhancers.

03

The paper discusses methodological advances and future prospects.

Abstract

Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These algorithms take as input epigenomic datasets, such as chromatin immunoprecipitation-sequencing (ChIP-seq) measurements of histone modifications or transcription factor binding. They partition the genome and assign a label to each segment such that positions with the same label exhibit similar patterns of input data. SAGA algorithms discover categories of activity such as promoters, enhancers, or parts of genes without prior knowledge of known genomic elements. In this sense, they generally act in an unsupervised fashion like clustering algorithms, but with the additional simultaneous function of segmenting the genome. Here, we review the common methodological framework that underlies these methods, review variants of and improvements upon this basic framework,…
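To make the "partition and label" idea concrete, the following is a minimal sketch, not the method of any particular SAGA tool. It assumes a toy two-state hidden Markov model with Gaussian emissions over a single made-up signal track, and decodes the most likely label per position with the Viterbi algorithm. The function names (`viterbi`, `to_segments`), the parameter values, and the signal are all hypothetical. Real SAGA methods learn their parameters from the data without supervision and typically use many input tracks; here the parameters are fixed by hand to keep the example short.

```python
import math


def viterbi(obs, means, stdev, stay_prob):
    """Most likely label sequence for a 1-D signal under a toy Gaussian HMM.

    Assumptions for illustration only: at least two states, a uniform
    initial distribution, one shared standard deviation, and a single
    probability of staying in the current state.
    """
    n_states = len(means)
    switch_prob = (1.0 - stay_prob) / (n_states - 1)
    log_trans = [
        [math.log(stay_prob if i == j else switch_prob) for j in range(n_states)]
        for i in range(n_states)
    ]

    def log_emit(x, k):
        # Log density of a Gaussian with mean means[k] and std dev stdev.
        z = (x - means[k]) / stdev
        return -0.5 * z * z - math.log(stdev * math.sqrt(2.0 * math.pi))

    # score[k] = best log probability of any path ending in state k.
    score = [math.log(1.0 / n_states) + log_emit(obs[0], k) for k in range(n_states)]
    back = []  # back[t][j] = best predecessor of state j at position t + 1
    for x in obs[1:]:
        prev = score
        ptr, score = [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][j] + log_emit(x, j))
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def to_segments(path):
    """Collapse per-position labels into (start, end, label) runs, end exclusive."""
    segments = []
    start = 0
    for pos in range(1, len(path) + 1):
        if pos == len(path) or path[pos] != path[start]:
            segments.append((start, pos, path[start]))
            start = pos
    return segments


# Hypothetical signal: low background with one enriched stretch in the middle.
signal = [0.1, 0.0, 0.2, 0.1, 3.1, 2.9, 3.0, 3.2, 0.1, 0.0, 0.2]
labels = viterbi(signal, means=[0.0, 3.0], stdev=0.5, stay_prob=0.9)
segments = to_segments(labels)
print(segments)
```

The labels that come out are just integers; as the abstract notes, the algorithm has no prior knowledge of genomic elements, so attaching a name such as "promoter" or "enhancer" to a label is a separate interpretation step.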

Taxonomy

Topics: Genomics and Chromatin Dynamics · Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics

Methods: SAGA