Length distribution of sequencing by synthesis: fixed flow cycle model
Yong Kong

TL;DR
This paper introduces the fixed flow cycle model for sequencing by synthesis, which gives the probability distribution of sequence length for a given number of flow cycles when nucleotide incorporation is probabilistic and may be incomplete.
Contribution
The paper presents a new fixed flow cycle model that derives the sequence length distribution in sequencing by synthesis, accommodating probabilistic and incomplete nucleotide incorporation.
Findings
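Derived explicit closed-form formulas for the mean and variance of the sequence length distribution.
Provided the probability distribution of sequence length for a given number of flow cycles, which the previous model did not yield.
Covers the general case of probabilistic, possibly incomplete nucleotide incorporation, as in some single-molecule sequencing technologies.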
Abstract
Sequencing by synthesis is the underlying technology for many next-generation DNA sequencing platforms. We developed a new model, the fixed flow cycle model, to derive the distributions of sequence length for a given number of flow cycles under the general conditions where the nucleotide incorporation is probabilistic and may be incomplete, as in some single-molecule sequencing technologies. Unlike the previous model, the new model yields the probability distribution for the sequence length. Explicit closed form formulas are derived for the mean and variance of the distribution.
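To make the setting concrete, the following is a minimal Monte Carlo sketch of sequence length after a fixed number of flow cycles. It is not the paper's derivation or its closed-form result. Everything specific here is an assumption made for illustration: the function names `simulate_length` and `length_distribution`, the flow order `ACGT`, template bases drawn independently with given probabilities, and a single per-base incorporation probability `p_inc`, where a failed incorporation ends extension for that flow.

```python
import random
from collections import Counter


def simulate_length(cycles, p_inc=1.0, flow_order="ACGT",
                    base_probs=None, rng=None, max_len=100_000):
    """Simulate one read and return its length after `cycles` flow cycles.

    Assumptions (illustrative, not taken from the paper):
    - template bases are i.i.d. draws with weights `base_probs` (uniform if None);
    - within a flow, each matching base is incorporated with probability `p_inc`;
    - a failed incorporation ends extension for that flow.
    """
    rng = rng or random.Random()
    bases = list(flow_order)
    weights = [base_probs[b] for b in bases] if base_probs else [1.0] * len(bases)
    next_base = rng.choices(bases, weights)[0]  # next template base to be read
    length = 0
    for _ in range(cycles):
        for nuc in flow_order:
            # Extend while the template matches the flowed nucleotide
            # and the incorporation attempt succeeds.
            while next_base == nuc and length < max_len and rng.random() < p_inc:
                length += 1
                next_base = rng.choices(bases, weights)[0]
    return length


def length_distribution(cycles, trials=10_000, seed=0, **kwargs):
    """Empirical distribution {length: relative frequency} over `trials` reads."""
    rng = random.Random(seed)
    counts = Counter(simulate_length(cycles, rng=rng, **kwargs) for _ in range(trials))
    return {n: c / trials for n, c in sorted(counts.items())}


def mean_and_variance(dist):
    """Mean and variance of a {value: probability} distribution."""
    mean = sum(n * p for n, p in dist.items())
    var = sum((n - mean) ** 2 * p for n, p in dist.items())
    return mean, var


if __name__ == "__main__":
    for p in (1.0, 0.8):
        dist = length_distribution(cycles=10, trials=5_000, p_inc=p)
        m, v = mean_and_variance(dist)
        print(f"p_inc={p}: mean={m:.2f}, variance={v:.2f}")
```

Setting `p_inc=1.0` corresponds to complete incorporation under these assumptions, while smaller values mimic incomplete incorporation. An empirical histogram of this kind could serve as a sanity check against closed-form expressions for the mean and variance, provided the simulated incorporation rule matches the one the formulas assume.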
