Big Data Challenges in Genome Informatics

Ka-Chun Wong

arXiv:1803.09632 · q-bio.OT · March 28, 2018

TL;DR

The paper discusses the data management challenges created by the rapid growth of genomic data, which is driven by advances in sequencing technologies, and emphasizes the need for new computational strategies.

Contribution

It provides a concise overview of the big data challenges in genomics, highlighting the scale and complexity of modern genomic datasets.

Findings

01. Genomic data volumes are increasing exponentially.

02. Current data storage and processing methods face scalability issues.

03. New computational approaches are needed to handle big genomic data.

Abstract

In recent years, we have witnessed a dramatic data explosion in genomics, thanks to improvements in sequencing technologies and drastically decreasing costs. We are entering an era in which millions of genomes are available. Notably, each genome can comprise billions of nucleotides, stored as plain text files that are gigabytes (GB) in size. These genomic data undeniably pose unprecedented data challenges. In this article, we briefly discuss the big data challenges associated with genomics in recent years.
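
To make the scale in the abstract concrete, the following is a rough back-of-the-envelope sketch, not a calculation taken from the paper. It assumes a genome of about 3 billion nucleotides (roughly the size of a human genome), one byte per nucleotide in plain text, and one million genomes. The 2-bit packing shown for comparison is an illustrative assumption, not a method the abstract describes, and the function names are hypothetical.

```python
def plain_text_bytes(num_nucleotides: int) -> int:
    """Bytes needed at one ASCII character per nucleotide.

    Ignores sequence headers and line breaks, so this is a lower bound
    for a real plain text file.
    """
    return num_nucleotides


def two_bit_bytes(num_nucleotides: int) -> int:
    """Bytes needed when A/C/G/T are packed into 2 bits each.

    Four nucleotides fit in one byte; the result is rounded up.
    """
    return (num_nucleotides + 3) // 4


# Assumptions for illustration only (not figures from the paper).
GENOME_LENGTH = 3_000_000_000  # about 3 billion nucleotides per genome
NUM_GENOMES = 1_000_000        # "millions of available genomes"

per_genome_gb = plain_text_bytes(GENOME_LENGTH) / 1e9
total_pb = plain_text_bytes(GENOME_LENGTH) * NUM_GENOMES / 1e15
packed_total_pb = two_bit_bytes(GENOME_LENGTH) * NUM_GENOMES / 1e15

print(f"Plain text, one genome:      {per_genome_gb:.2f} GB")
print(f"Plain text, {NUM_GENOMES:,} genomes: {total_pb:.2f} PB")
print(f"2-bit packed, {NUM_GENOMES:,} genomes: {packed_total_pb:.2f} PB")
```

Under these assumptions a single genome takes a few gigabytes as plain text, and a million of them reach the petabyte range even before quality scores, annotations, or raw reads are counted.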
