Mixture model with multiple allocations for clustering spatially correlated observations in the analysis of ChIP-Seq data
Saverio Ranciati, Cinzia Viroli, Ernst Wit

TL;DR
This paper introduces a novel mixture model for clustering ChIP-Seq data that allows for multiple group memberships per observation, incorporates zero-inflation correction, and accounts for spatial dependencies, improving analysis of complex biological signals.
Contribution
It proposes a new mixture model with multiple allocations, zero-inflation correction, and spatial dependency modeling specifically designed for ChIP-Seq data analysis.
Findings
The model captures multiple group memberships per observation.
A zero-inflation component handles excess zeros.
The authors report improved clustering performance on real data.
Abstract
Model-based clustering is a technique widely used to group a collection of units into mutually exclusive groups. There are, however, situations in which an observation could in principle belong to more than one cluster. In the context of Next-Generation Sequencing (NGS) experiments, for example, the signal observed in the data might be produced by two (or more) different biological processes operating together and a gene could participate in both (or all) of them. We propose a novel approach to cluster NGS discrete data, coming from a ChIP-Seq experiment, with a mixture model, allowing each unit to belong potentially to more than one group: these multiple allocation clusters can be flexibly defined via a function combining the features of the original groups without introducing new parameters. The formulation naturally gives rise to a 'zero-inflation group' in which values close to zero…
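The multiple-allocation idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the components are Poisson, the combining function is the sum of the rates of the active primary groups, the all-zero allocation is given a tiny fixed rate (`floor`), and the names `allocations`, `combined_rate`, and `posterior` are hypothetical. Spatial dependence is not modelled.

```python
import itertools
import math


def poisson_pmf(y, rate):
    """Poisson probability of count y, computed on the log scale for stability."""
    return math.exp(-rate + y * math.log(rate) - math.lgamma(y + 1))


def allocations(k):
    """All 2^k binary allocation vectors over k primary groups."""
    return list(itertools.product((0, 1), repeat=k))


def combined_rate(u, rates, floor=1e-3):
    """Assumed combining function: sum the rates of the active primary groups.

    No new parameters are introduced for multi-group allocations. The all-zero
    allocation falls back to a tiny rate (an assumption of this sketch) so that
    it concentrates on counts near zero.
    """
    total = sum(r for flag, r in zip(u, rates) if flag)
    return total if total > 0 else floor


def posterior(y, rates, weights):
    """Posterior probability of each allocation vector given one count y."""
    allocs = allocations(len(rates))
    joint = [w * poisson_pmf(y, combined_rate(u, rates))
             for u, w in zip(allocs, weights)]
    norm = sum(joint)
    return {u: j / norm for u, j in zip(allocs, joint)}


# Two primary groups with hypothetical rates and uniform allocation weights.
rates = [5.0, 20.0]
weights = [0.25] * 4
for y in (0, 5, 25):
    post = posterior(y, rates, weights)
    print(y, max(post, key=post.get))
```

With two primary groups the sketch has four allocation vectors but still only two rate parameters, which is the point of defining the multi-group clusters through a combining function.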
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · Bioinformatics and Genomic Networks
