Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing
Denisa Duma, Mary Wootters, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra, Matthew Alpert, Timothy J. Close, Gianfranco Ciardo, and Stefano Lonardi

TL;DR
This paper introduces a novel decoding algorithm that combines compressed sensing and error-correcting codes to improve the quality of genome assemblies from pooled sequencing data, overcoming limitations of DNA barcoding.
Contribution
The paper presents a new decoding algorithm for pooled sequencing data that significantly enhances assembly quality, leveraging combinatorial pooling, compressed sensing, and error-correcting codes.
Findings
Higher quality genome assemblies achieved
Effective decoding of pooled sequencing data demonstrated
Applicable to large, repetitive genomes
Abstract
In order to overcome the limitations imposed by DNA barcoding when multiplexing a large number of samples in the current generation of high-throughput sequencing instruments, we have recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing; doi:10.1371/journal.pcbi.1003010). We have also demonstrated how this new protocol would enable de novo selective sequencing and assembly of large, highly-repetitive genomes. Here we address the problem of decoding pooled sequenced data obtained from such a protocol. Our algorithm employs a synergistic combination of ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our novel decoding algorithm enables significantly higher quality assemblies than the previous approach.
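To make the decoding problem concrete, the following is a minimal sketch of the simplest group-testing decoder on an error-free toy design. It is not the paper's algorithm, which combines compressed sensing with error-correcting-code decoding. The function names (`make_design`, `pool_signature`, `decode`), the pool counts, and the design itself are assumptions chosen for illustration only.

```python
from itertools import combinations

def make_design(num_pools, pools_per_sample):
    """Hypothetical toy design: give each sample a distinct subset of pools."""
    return [frozenset(c) for c in combinations(range(num_pools), pools_per_sample)]

def pool_signature(design, positives):
    """Pools that test positive when exactly the samples in `positives` contain the item."""
    signature = set()
    for sample in positives:
        signature |= design[sample]
    return signature

def decode(design, positive_pools):
    """Naive decoder: keep every sample whose pools are all positive."""
    return [s for s, pools in enumerate(design) if pools <= positive_pools]

# 20 samples multiplexed into 6 pools, each sample placed in 3 pools.
design = make_design(num_pools=6, pools_per_sample=3)

# An item (for example, a read) present in a single sample decodes uniquely.
single = decode(design, pool_signature(design, positives=[4]))

# An item shared by two samples with disjoint pool sets lights up every pool,
# so the naive rule can no longer rule anyone out.
shared = decode(design, pool_signature(design, positives=[0, 19]))

print(single)       # the one sample that contains the item
print(len(shared))  # number of candidates left when the signature is ambiguous
```

In this toy setting a read from one sample decodes exactly, while a read shared by two samples leaves every sample as a candidate. Ambiguity of this kind, together with sequencing errors that corrupt the observed signature, is the sort of difficulty that a more robust decoder has to handle.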
Taxonomy
Topics: Advanced biosensing and bioanalysis techniques · Genomic variations and chromosomal abnormalities · DNA and Biological Computing
