Mixture model with multiple allocations for clustering spatially correlated observations in the analysis of ChIP-Seq data

Saverio Ranciati; Cinzia Viroli; Ernst Wit

arXiv:1601.04879·stat.AP·May 13, 2016

TL;DR

This paper introduces a novel mixture model for clustering ChIP-Seq data that allows for multiple group memberships per observation, incorporates zero-inflation correction, and accounts for spatial dependencies, improving analysis of complex biological signals.

Contribution

It proposes a new mixture model with multiple allocations, zero-inflation correction, and spatial dependency modeling specifically designed for ChIP-Seq data analysis.

Findings

01. Model effectively captures multiple group memberships.

02. Incorporates zero-inflation to handle excess zeros.

03. Demonstrates improved clustering performance on real data.

Abstract

Model-based clustering is a technique widely used to group a collection of units into mutually exclusive groups. There are, however, situations in which an observation could in principle belong to more than one cluster. In the context of Next-Generation Sequencing (NGS) experiments, for example, the signal observed in the data might be produced by two (or more) different biological processes operating together and a gene could participate in both (or all) of them. We propose a novel approach to cluster NGS discrete data, coming from a ChIP-Seq experiment, with a mixture model, allowing each unit to belong potentially to more than one group: these multiple allocation clusters can be flexibly defined via a function combining the features of the original groups without introducing new parameters. The formulation naturally gives rise to a 'zero-inflation group' in which values close to zero…
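The multiple-allocation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes Poisson-distributed counts, an additive combining function (the rate of a multiple-allocation cluster is the sum of the rates of the primary groups it contains), and a small hypothetical `background` rate so that the all-zero allocation acts as a group for near-zero counts. All function names are made up for this example. With K primary groups there are 2^K allocation vectors, yet the only parameters are the K primary rates.

```python
from itertools import product
import math


def allocation_vectors(k):
    """Return all 2**k binary membership vectors for k primary groups."""
    return [tuple(u) for u in product((0, 1), repeat=k)]


def combined_rate(u, rates, background=0.0):
    """Assumed combining function: add up the rates of the groups switched on in u."""
    return background + sum(r for flag, r in zip(u, rates) if flag)


def poisson_logpmf(y, lam):
    """Log-probability of count y under a Poisson distribution with mean lam."""
    if lam == 0.0:
        # A zero-rate Poisson puts all its mass on y == 0.
        return 0.0 if y == 0 else -math.inf
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def best_allocation(y, rates, background=0.05):
    """Pick the membership vector whose combined rate best explains count y.

    Uses equal weights on all allocations (an assumption for this sketch).
    """
    return max(
        allocation_vectors(len(rates)),
        key=lambda u: poisson_logpmf(y, combined_rate(u, rates, background)),
    )


if __name__ == "__main__":
    primary_rates = [3.0, 10.0]  # hypothetical rates for two primary groups
    for count in (0, 3, 13):
        print(count, best_allocation(count, primary_rates))
```

In this sketch the vector of all zeros has only the background rate, which is one way to read the 'zero-inflation group' the abstract mentions. A full model would also estimate mixing weights and the spatial dependence between neighbouring observations, both of which are omitted here.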

Taxonomy

Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · Bioinformatics and Genomic Networks