Searching and Indexing Genomic Databases via Kernelization

Travis Gagie; Simon J. Puglisi

arXiv:1412.1591·cs.DS·December 5, 2014

Open Access

TL;DR

This paper surveys how genomic databases can be searched and indexed efficiently by exploiting the similarity between genomes, and relates these approaches to kernelization in parameterized complexity.

Contribution

It surveys the twenty-year history of searching or indexing a single reference genome together with only the parts of the other genomes that differ from it, and relates this idea to kernelization in parameterized complexity.

Findings

01 Historical overview of genome indexing methods

02 Connection between genome similarity techniques and kernelization

03 Insights into the evolution of efficient search algorithms

Abstract

The rapid advance of DNA sequencing technologies has yielded databases of thousands of genomes. To search and index these databases effectively, it is important that we take advantage of the similarity between those genomes. Several authors have recently suggested searching or indexing only one reference genome and the parts of the other genomes where they differ. In this paper we survey the twenty-year history of this idea and discuss its relation to kernelization in parameterized complexity.
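
The idea in the abstract, searching one reference genome plus only the parts of the other genomes where they differ, can be illustrated with a small sketch. This is not the authors' method or any specific algorithm from the survey; it is a minimal illustration under simplifying assumptions: every genome has the same length as the reference (substitutions only, no insertions or deletions), patterns have length at most a fixed bound `m`, and plain substring search stands in for a real index. The names `find_all`, `build_kernel`, and `search` are hypothetical.

```python
from bisect import bisect_left


def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps included)."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def build_kernel(reference, genomes, m):
    """For each genome, keep only windows around its differences from the
    reference, padded by m - 1 characters on each side.

    Assumption for this sketch: genomes differ from the reference by
    substitutions only, so all sequences have equal length.
    """
    diffs_per_genome, kernel = [], []
    for gid, genome in enumerate(genomes):
        assert len(genome) == len(reference)
        diffs = [i for i, (a, b) in enumerate(zip(reference, genome)) if a != b]
        diffs_per_genome.append(diffs)
        windows = []
        for i in diffs:
            lo, hi = max(0, i - (m - 1)), min(len(genome), i + m)
            if windows and lo <= windows[-1][1]:
                # Overlapping or touching windows are merged into one.
                windows[-1][1] = max(windows[-1][1], hi)
            else:
                windows.append([lo, hi])
        kernel.extend((gid, lo, genome[lo:hi]) for lo, hi in windows)
    return diffs_per_genome, kernel


def search(reference, diffs_per_genome, kernel, pattern, m):
    """Return the set of (genome_id, position) occurrences of pattern."""
    assert 1 <= len(pattern) <= m
    hits = set()
    # Occurrences in the reference carry over to every genome that has no
    # difference inside the matched range.
    for p in find_all(reference, pattern):
        for gid, diffs in enumerate(diffs_per_genome):
            k = bisect_left(diffs, p)
            if k == len(diffs) or diffs[k] >= p + len(pattern):
                hits.add((gid, p))
    # Occurrences that overlap a difference lie entirely inside a window.
    for gid, lo, text in kernel:
        for off in find_all(text, pattern):
            hits.add((gid, lo + off))
    return hits


reference = "ACGTACGTAC"
genomes = ["ACGTACGTAC", "ACGAACGTAC"]
diffs, kernel = build_kernel(reference, genomes, m=4)
print(sorted(search(reference, diffs, kernel, "ACGT", m=4)))
```

In this sketch the text that actually gets searched is the reference plus the windows, whose total size grows with the number of differences rather than with the number of genomes. That is the sense in which the reduced instance resembles a kernel; the survey itself should be consulted for the precise formulations.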

Taxonomy

Topics: Algorithms and Data Compression · Genome Rearrangement Algorithms · Genomics and Phylogenetic Studies