Estimation of the methylation pattern distribution from deep sequencing data
Peijie Lin, Sylvain Foret, Susan R. Wilson, Conrad J. Burden

TL;DR
This paper introduces a statistical model and an R package for estimating the distribution of DNA methylation patterns from bisulphite sequencing data, accounting for sequencing errors and incomplete bisulphite conversion of cytosines.
Contribution
It presents a novel statistical approach and software tool for quantifying methylation pattern diversity from sequencing data, improving accuracy over previous methods.
Findings
Model accounts for sequencing errors and spurious reads caused by incomplete bisulphite conversion
Provides reliable estimation of methylation pattern distribution
Implemented in the MPFE R package
Abstract
Motivation: Bisulphite sequencing enables the detection of cytosine methylation. The sequence of the methylation states of cytosines on any given read forms a methylation pattern that carries substantially more information than merely studying the average methylation level at individual positions. In order to understand better the complexity of DNA methylation landscapes in biological samples, it is important to study the diversity of these methylation patterns. However, the accurate quantification of methylation patterns is subject to sequencing errors and spurious signals due to incomplete bisulphite conversion of cytosines.
Results: A statistical model is developed which accounts for the distribution of DNA methylation patterns at any given locus. The model incorporates the effects of sequencing errors and spurious reads, and enables estimation of the true underlying distribution of…
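The idea of recovering a true pattern distribution from noisy reads can be illustrated with a small sketch. The code below is not the MPFE package's interface and not necessarily the paper's exact model. It assumes each CpG site is misread independently, with two hypothetical parameters: `fail_rate` (an unmethylated site is read as methylated, for example through incomplete conversion) and `loss_rate` (a methylated site is read as unmethylated). Both rates are assumed known and strictly below 1. Under these assumptions, a standard expectation-maximisation loop estimates the underlying pattern frequencies from observed pattern counts.

```python
from itertools import product


def transition_prob(true_pat, obs_pat, fail_rate, loss_rate):
    """P(observed pattern | true pattern), assuming independent per-site errors."""
    p = 1.0
    for t, o in zip(true_pat, obs_pat):
        if t == 0:
            # Unmethylated site: read as methylated with probability fail_rate.
            p *= fail_rate if o == 1 else 1.0 - fail_rate
        else:
            # Methylated site: read as unmethylated with probability loss_rate.
            p *= loss_rate if o == 0 else 1.0 - loss_rate
    return p


def estimate_pattern_distribution(counts, n_sites, fail_rate, loss_rate, n_iter=500):
    """EM estimate of the true pattern frequencies from observed pattern counts.

    counts maps observed patterns (tuples of 0/1, length n_sites) to read counts.
    """
    patterns = list(product((0, 1), repeat=n_sites))
    total = sum(counts.values())
    # Start from a uniform guess over all 2**n_sites possible patterns.
    theta = {p: 1.0 / len(patterns) for p in patterns}
    for _ in range(n_iter):
        expected = {p: 0.0 for p in patterns}
        for obs, c in counts.items():
            # E-step: posterior weight of each true pattern given this observation.
            weights = {
                t: theta[t] * transition_prob(t, obs, fail_rate, loss_rate)
                for t in patterns
            }
            norm = sum(weights.values())
            for t in patterns:
                expected[t] += c * weights[t] / norm
        # M-step: renormalise expected counts into frequencies.
        theta = {p: v / total for p, v in expected.items()}
    return theta


# Hypothetical counts at a locus with three CpG sites (illustrative data only).
counts = {
    (0, 0, 0): 480,
    (1, 1, 1): 480,
    (1, 0, 0): 15,
    (0, 1, 1): 15,
    (0, 0, 1): 10,
}
estimate = estimate_pattern_distribution(counts, n_sites=3, fail_rate=0.01, loss_rate=0.01)
for pattern, freq in sorted(estimate.items(), key=lambda kv: -kv[1]):
    print(pattern, round(freq, 4))
```

In this sketch, rare observed patterns that differ from a dominant pattern at a single site tend to be reassigned to that dominant pattern, which is the intuition behind separating spurious patterns from genuinely present ones. Enumerating all 2^n patterns is only practical for a small number of sites.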
Taxonomy
Topics: Epigenetics and DNA Methylation · RNA modifications and cancer · Genomics and Chromatin Dynamics
