Segmentation and genome annotation algorithms
Maxwell W Libbrecht, Rachel CW Chan, Michael M Hoffman

TL;DR
Segmentation and genome annotation (SAGA) algorithms analyze epigenomic data to segment the genome and identify functional elements like promoters and enhancers, aiding understanding of gene regulation.
Contribution
This paper reviews the common framework, variants, and improvements of SAGA algorithms, and catalogs existing large-scale annotations and future directions.
Findings
SAGA algorithms segment the genome based on epigenomic patterns.
They identify functional categories such as promoters and enhancers.
The paper discusses methodological advances and future prospects.
Abstract
Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These algorithms take as input epigenomic datasets, such as chromatin immunoprecipitation-sequencing (ChIP-seq) measurements of histone modifications or transcription factor binding. They partition the genome and assign a label to each segment such that positions with the same label exhibit similar patterns of input data. SAGA algorithms discover categories of activity such as promoters, enhancers, or parts of genes without prior knowledge of known genomic elements. In this sense, they generally act in an unsupervised fashion like clustering algorithms, but with the additional simultaneous function of segmenting the genome. Here, we review the common methodological framework that underlies these methods, review variants of and improvements upon this basic framework,…
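The partition-and-label step described in the abstract can be illustrated with a small hidden Markov model, one common way of formulating this kind of problem. The sketch below is an illustration only, not the method of any specific SAGA tool. It assumes a single signal track that has already been binned, two labels whose mean signal levels are fixed by hand, Gaussian emissions with a shared variance, and a simple "stay or switch" transition model. The names `viterbi`, `to_segments`, `stay_prob`, and `sigma`, and the toy signal values, are all hypothetical. Real SAGA methods learn their parameters from the data without supervision, which this sketch skips.

```python
import math


def viterbi(signal, means, stay_prob=0.9, sigma=1.0):
    """Most likely label per bin under a toy Gaussian-emission HMM.

    Assumptions (illustrative only): at least two labels, hand-set means,
    a shared standard deviation, and a uniform initial label distribution.
    """
    k = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (k - 1))

    def log_emit(x, mu):
        # Gaussian log-density, dropping the constant shared by all labels.
        return -0.5 * ((x - mu) / sigma) ** 2

    def log_trans(i, j):
        return log_stay if i == j else log_switch

    # Forward pass in log space, remembering the best predecessor per label.
    score = [log_emit(signal[0], mu) for mu in means]
    back = []
    for x in signal[1:]:
        new_score, pointers = [], []
        for j in range(k):
            best_i = max(range(k), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(score[best_i] + log_trans(best_i, j) + log_emit(x, means[j]))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace back from the best final label.
    label = max(range(k), key=lambda j: score[j])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


def to_segments(labels):
    """Collapse per-bin labels into (start_bin, end_bin_exclusive, label) runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments


# Toy binned signal: low, then high, then low again (made-up values).
signal = [0.1, -0.2, 0.3, 4.8, 5.1, 4.9, 5.3, 0.2, -0.1, 0.0]
labels = viterbi(signal, means=[0.0, 5.0])
segments = to_segments(labels)
print(segments)
```

Segmentation and labeling happen in one pass here: the transition penalty discourages label changes between neighboring bins, so the per-bin labels naturally form contiguous segments. The integer labels carry no biological meaning on their own; interpreting a label as, for example, a promoter or enhancer is a separate step.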
Taxonomy
Topics: Genomics and Chromatin Dynamics · Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics
Methods: SAGA
