PeakSegJoint: fast supervised peak detection via joint segmentation of multiple count data samples

Toby Dylan Hocking; Guillaume Bourque

arXiv:1506.01286·stat.ML·June 4, 2015

TL;DR

PeakSegJoint is a fast, supervised peak detection method for multiple genomic samples. It handles any number of sample types and places overlapping peaks at exactly the same positions across samples. Compared with existing algorithms, it is faster and more interpretable at similar accuracy.

Contribution

The paper introduces a constrained maximum likelihood segmentation model for any number of sample types, together with a supervised penalty learning model that selects the number of peaks.

Findings

01 Achieves accuracy similar to state-of-the-art peak detection algorithms

02 Runs faster than those algorithms

03 Yields a more interpretable model, with overlapping peaks at exactly the same positions across all samples

Abstract

Joint peak detection is a central problem when comparing samples in genomic data analysis, but current algorithms for this task are unsupervised and limited to at most 2 sample types. We propose PeakSegJoint, a new constrained maximum likelihood segmentation model for any number of sample types. To select the number of peaks in the segmentation, we propose a supervised penalty learning model. To infer the parameters of these two models, we propose to use a discrete optimization heuristic for the segmentation, and convex optimization for the penalty learning. In comparisons with state-of-the-art peak detection algorithms, PeakSegJoint achieves similar accuracy, faster speeds, and a more interpretable model with overlapping peaks that occur in exactly the same positions across all samples.
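
The idea of a joint peak with a penalty on the number of peaks can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Poisson loss for the count data, uses an exhaustive search over peak positions rather than the discrete optimization heuristic named in the abstract, and takes the penalty as a fixed input rather than learning it. The function names (`poisson_loss`, `peak_gain`, `best_joint_peak`) are hypothetical.

```python
import math

def poisson_loss(total, length):
    """Poisson loss of one segment at its best mean (assumed loss, constants dropped)."""
    if total == 0:
        return 0.0
    return total - total * math.log(total / length)

def peak_gain(counts, start, end):
    """Loss reduction from modelling one sample as background/peak/background
    with the peak on [start, end), compared with a single flat segment.
    Returns 0.0 when the peak mean is not strictly above both backgrounds."""
    before, peak, after = counts[:start], counts[start:end], counts[end:]
    peak_mean = sum(peak) / len(peak)
    if peak_mean <= sum(before) / len(before) or peak_mean <= sum(after) / len(after):
        return 0.0
    flat = poisson_loss(sum(counts), len(counts))
    three = sum(poisson_loss(sum(s), len(s)) for s in (before, peak, after))
    return flat - three

def best_joint_peak(samples, penalty):
    """Exhaustive search for one peak shared by all samples that have it.
    A sample receives the peak only if its loss reduction exceeds the penalty.
    Returns (start, end, sample_indices), or None if no peak is worth the penalty."""
    n = len(samples[0])  # all samples are assumed to have the same length
    best, best_score = None, 0.0
    for start in range(1, n - 1):
        for end in range(start + 1, n):
            gains = [peak_gain(c, start, end) for c in samples]
            chosen = [i for i, g in enumerate(gains) if g > penalty]
            score = sum(gains[i] - penalty for i in chosen)
            if score > best_score:
                best, best_score = (start, end, chosen), score
    return best

# Toy data: the first two samples share a peak, the third is flat.
samples = [
    [1, 1, 1, 9, 9, 9, 1, 1, 1],
    [2, 2, 2, 8, 8, 8, 2, 2, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3],
]
result = best_joint_peak(samples, penalty=1.0)
print(result)
```

On this toy input the search reports a single peak on positions 3 to 6 (end exclusive) in the first two samples only. A larger penalty yields fewer samples with peaks, which is the role the learned penalty plays in selecting the number of peaks.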

Code & Models

Repositories

tdhock/PeakSegJoint (official)

Taxonomy

Topics: Gene expression and cancer classification · Genomics and Chromatin Dynamics · Genomics and Phylogenetic Studies