Low-bandwidth and non-compute intensive remote identification of microbes from raw sequencing reads

Laurent Gautier; Ole Lund

arXiv:1306.1569·q-bio.GN·March 5, 2014

TL;DR

This paper introduces a system for identifying microbes from raw sequencing reads that needs little bandwidth and little computation on the client: the client queries a remote server and transfers only a small amount of data.

Contribution

The paper presents an approach that identifies microbes without requiring the reference genome to be specified in advance, using a distributed architecture that keeps data transfer and computation small.

Findings

01 System can identify microbes with minimal data transfer

02 Implemented web server indexing thousands of genomes

03 Client can run in a web browser on modest devices

Abstract

Cheap high-throughput DNA sequencing may soon become routine not only for human genomes but also for practically anything requiring the identification of living organisms from their DNA: tracking of infectious agents, control of food products, bioreactors, or environmental samples. We propose a novel general approach to the analysis of sequencing data in which the reference genome does not have to be specified. Using a distributed architecture we are able to query a remote server for hints about what the reference might be, transferring a relatively small amount of data, and the hints can be used for more computationally-demanding work. Our system consists of a server with known reference DNA indexed, and a client with raw sequencing reads. The client sends a sample of unidentified reads, and in return receives a list of matching references known to the server. Sequences for the…
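
The exchange described in the abstract (the client sends a small sample of reads, the server answers with a list of matching references) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the names `ToyServer`, `sample_reads`, and `K` are hypothetical, the server is an in-process object rather than a remote service, and exact k-mer lookup stands in for whatever indexing and matching the authors actually use, which this page does not describe.

```python
import random
from collections import Counter, defaultdict

K = 8  # hypothetical k-mer length, chosen only for this toy example


def kmers(seq, k=K):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class ToyServer:
    """Stands in for the remote server holding indexed reference DNA."""

    def __init__(self, references):
        # Map each k-mer to the set of reference names that contain it.
        self.index = defaultdict(set)
        for name, seq in references.items():
            for km in kmers(seq):
                self.index[km].add(name)

    def query(self, reads):
        """Return (reference name, number of matching reads), best first."""
        hits = Counter()
        for read in reads:
            matched = set()
            for km in kmers(read):
                matched |= self.index.get(km, set())
            # Each read counts at most once per reference.
            hits.update(matched)
        return hits.most_common()


def sample_reads(reads, n, seed=0):
    """Client side: pick a small random subset of reads to send."""
    rng = random.Random(seed)
    return rng.sample(reads, min(n, len(reads)))


# Made-up reference sequences; real references would be whole genomes.
references = {
    "ref_A": "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACG",
    "ref_B": "TTGACCGGTTAACCGGTTAAGGCCTTAAGGCCAAT",
}
# Simulated reads: 12-base windows taken from ref_A.
reads = [references["ref_A"][i:i + 12] for i in range(0, 20, 4)]

server = ToyServer(references)
print(server.query(sample_reads(reads, 3)))  # → [('ref_A', 3)]
```

Only the sampled reads travel to the server and only a short ranked list comes back, which is the property the abstract emphasizes. The client can then fetch or align against the suggested references for the more computationally demanding work.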
