Designing robust watermark barcodes for multiplex long-read sequencing
Joaquín Ezpeleta, Flavia J. Krsticevic, Pilar Bulacio, Elizabeth Tapia

TL;DR
This paper introduces a novel method for designing robust watermark barcodes that can tolerate high error rates in long-read sequencing, enabling accurate multiplexing of thousands of samples without upstream quality improvements.
Contribution
The authors present the first barcode design method specifically addressing high-error long-read sequencing, with software tools and example sets provided.
Findings
Barcodes achieve sample misassignment probabilities as low as 10^{-7}
Method tolerates error rates around 11% in long-read sequencing
Software tools are freely available for barcode construction and demultiplexing
Abstract
A method for designing sequencing barcodes that can withstand a large number of insertion, deletion and substitution errors and are suitable for use in multiplex single-molecule real-time sequencing is presented. The manuscript focuses on the design of barcodes for full-length single-pass reads, impaired by challenging error rates in the order of 11%. To the authors' knowledge, this is the first method to specifically address this problem without requiring upstream quality improvement. The proposed barcodes can multiplex hundreds or thousands of samples while achieving sample misassignment probabilities as low as 10^{-7}, and are designed to be compatible with chemical constraints imposed by the sequencing process. Software for constructing watermark barcode sets and demultiplexing barcoded reads, together with example sets of barcodes and synthetic barcoded reads, are freely…
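To make the error model in the abstract concrete, the following is a minimal Python sketch of an insertion/deletion/substitution channel and a naive nearest-barcode demultiplexer. It is not the authors' watermark construction or decoder: the function names (`ids_channel`, `levenshtein`, `demultiplex`), the even split of the roughly 11% error rate across the three error types, and the use of plain edit distance are all assumptions made here for illustration.

```python
import random

ALPHABET = "ACGT"

def ids_channel(seq, p_ins, p_del, p_sub, rng):
    """Pass seq through a simple insertion/deletion/substitution channel.

    Hypothetical error model: per-base independent errors, chosen only
    for illustration (not taken from the paper).
    """
    out = []
    for base in seq:
        # Possibly insert a random base before the current one.
        if rng.random() < p_ins:
            out.append(rng.choice(ALPHABET))
        r = rng.random()
        if r < p_del:
            continue  # the current base is deleted
        if r < p_del + p_sub:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def levenshtein(a, b):
    """Edit distance between a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]

def demultiplex(read, barcodes):
    """Return the index of the barcode closest to read in edit distance.

    A stand-in for a real decoder; ties go to the lowest index.
    """
    return min(range(len(barcodes)),
               key=lambda i: levenshtein(read, barcodes[i]))

if __name__ == "__main__":
    rng = random.Random(0)
    # Toy barcode set of random sequences (assumed length and set size).
    barcodes = ["".join(rng.choice(ALPHABET) for _ in range(40))
                for _ in range(8)]
    p = 0.11 / 3  # assumed even split of an ~11% total error rate
    noisy = ids_channel(barcodes[3], p, p, p, rng)
    print(len(noisy), demultiplex(noisy, barcodes))
```

Random barcodes with edit-distance decoding give no designed guarantee on misassignment probability; the sketch only illustrates the kind of channel a barcode set has to survive.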
Taxonomy
Topics: DNA and Biological Computing · Algorithms and Data Compression · QR Code Applications and Technologies
