EXMA: A Genomics Accelerator for Exact-Matching

Lei Jiang; Farzaneh Zokaee

arXiv:2101.05314·cs.AR·January 15, 2021

Open Access

TL;DR

EXMA is a novel hardware accelerator that significantly improves the throughput and energy efficiency of FM-Index based exact-match operations in genomics by processing multiple DNA symbols per DRAM activation.

Contribution

The paper introduces EXMA, a new accelerator with a multi-task-learning based index and advanced memory management techniques, achieving substantial performance gains over prior FM-Index accelerators.

Findings

01. EXMA achieves 4.9x higher search throughput than state-of-the-art PIMs.

02. EXMA improves throughput per Watt by 4.8x.

03. The proposed techniques enhance memory utilization and reduce data structure size.

Abstract

Genomics is the foundation of precision medicine, global food security and virus surveillance. Exact-match is one of the most essential operations, widely used in almost every step of genomics, such as alignment, assembly, annotation, and compression. Modern genomics adopts the Ferragina-Manzini Index (FM-Index), which augments the space-efficient Burrows-Wheeler transform (BWT) with additional data structures to permit ultra-fast exact-match operations. However, FM-Index is notorious for its poor spatial locality and random memory access pattern. Prior works create GPU-, FPGA-, ASIC- and even process-in-memory (PIM)-based accelerators to boost FM-Index search throughput. Though they achieve state-of-the-art FM-Index search throughput, FM-Index PIMs, like all prior conventional accelerators, process only one DNA symbol after each DRAM row activation, thereby suffering from poor memory…
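For readers unfamiliar with FM-Index search, the following is a minimal software sketch of textbook backward search in Python. It is not EXMA's hardware design or the authors' code, and the function names (`build_fm_index`, `find`) are hypothetical, chosen only for illustration. It shows why the access pattern is irregular: each pattern symbol triggers two occurrence-table lookups whose positions depend on the result of the previous step, so the search consumes one DNA symbol per dependent lookup.

```python
def build_fm_index(text):
    """Build a naive FM-Index (suffix array, C table, Occ table) for text.

    Illustrative only: the quadratic suffix sort and the dense Occ table
    are far too large and slow for real genomes.
    """
    text += "$"  # sentinel, sorts before A, C, G, T in ASCII
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # symbol preceding each suffix

    # c_table[ch]: number of symbols in text strictly smaller than ch
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def find(pattern, sa, c_table, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):  # one symbol per step, right to left
        if ch not in c_table:
            return []
        # Both lookups land at data-dependent offsets in the Occ table.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, c_table, occ = build_fm_index("ACGTACGTAC")
print(find("AC", sa, c_table, occ))   # [0, 4, 8]
print(find("CGT", sa, c_table, occ))  # [1, 5]
```

In this sketch the `lo` and `hi` updates are the only work per symbol, so throughput is bounded by how quickly those two scattered lookups can be served from memory.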

Taxonomy

Topics: Algorithms and Data Compression · Error Correcting Code Techniques · DNA and Biological Computing