Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis, and beyond
Sungmin Lee, Hyeyoung Min, and Sungroh Yoon

TL;DR
This paper investigates how solid-state drives (SSDs) can accelerate bioinformatics workflows by profiling 23 key programs, analyzing performance gains, and discussing optimization strategies for integrating SSDs with parallel computing.
Contribution
It provides an in-depth profiling and analysis of bioinformatics programs on SSDs, offering insights and recommendations for optimizing bioinformatics pipelines with new storage technologies.
Findings
SSDs can significantly speed up bioinformatics programs.
Parallelization combined with SSDs enhances performance.
Profiling reveals specific bottlenecks and optimization opportunities.
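As a rough illustration of why profiling matters for the findings above: replacing an HDD with an SSD shortens only the I/O-bound share of a program's runtime. The sketch below applies the standard Amdahl's-law estimate to that share. The function name and the example values for the I/O fraction and the I/O speedup factor are hypothetical and are not taken from the paper's measurements.

```python
def overall_speedup(io_fraction: float, io_speedup: float) -> float:
    """Amdahl-style estimate of whole-program speedup.

    io_fraction: share of total runtime spent on storage I/O (0..1),
                 as measured by profiling (hypothetical input).
    io_speedup:  factor by which the faster device shortens that I/O time
                 (hypothetical input, not a figure from the paper).
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must lie in [0, 1]")
    if io_speedup <= 0.0:
        raise ValueError("io_speedup must be positive")
    # The non-I/O share is unchanged; the I/O share shrinks by io_speedup.
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


if __name__ == "__main__":
    # Illustrative values only: compare compute-bound and I/O-bound programs.
    for fraction in (0.1, 0.5, 0.9):
        estimate = overall_speedup(fraction, io_speedup=10.0)
        print(f"I/O fraction {fraction:.1f}: estimated speedup {estimate:.2f}x")
```

Under this simple model, a program that spends little time on I/O gains little from faster storage no matter how fast the drive is, which is why a per-program profile is needed before expecting a drop-in replacement to help.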
Abstract
A wide variety of large-scale data has been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses due to factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives (HDDs) by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but…
Taxonomy
Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · Caching and Content Delivery
