# Extreme-Scale De Novo Genome Assembly

**Authors:** Evangelos Georganas, Steven Hofmeyr, Rob Egan, Aydin Buluc, Leonid Oliker, Daniel Rokhsar, Katherine Yelick

arXiv: 1705.11147 · 2017-06-01

## TL;DR

This paper introduces HipMer, an end-to-end de novo genome assembler for supercomputers, built by parallelizing the Meraculous code and used to assemble the human and hexaploid wheat genomes on up to tens of thousands of cores.

## Contribution

The paper presents an efficient parallelization of the Meraculous assembler. It describes the computational challenges of each pipeline step, the key distributed data structures, and the communication costs, and reports assembly performance on large supercomputers.

## Key findings

- Assembled the human genome and the large hexaploid wheat genome on large supercomputers
- Parallelized every stage of the Meraculous pipeline, each of which stresses a different part of the computer system
- Reported performance results on up to tens of thousands of cores

## Abstract

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
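
To make the idea of a distributed data structure over short, overlapping reads concrete, here is a minimal single-process sketch. This summary does not name the data structures the paper uses, so the hash-partitioned k-mer table below is an assumption chosen for illustration, not the authors' implementation. The names `kmers`, `owner`, `count_partitioned`, and `n_ranks` are hypothetical, and each "rank" is simulated by a local `Counter` rather than a real process.

```python
from collections import Counter
import zlib


def kmers(read, k):
    """Yield every length-k substring (k-mer) of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def owner(kmer, n_ranks):
    """Map a k-mer to a hypothetical owning rank with a deterministic hash."""
    return zlib.crc32(kmer.encode("ascii")) % n_ranks


def count_partitioned(reads, k, n_ranks):
    """Count k-mers, keeping each count in the partition that owns the k-mer."""
    partitions = [Counter() for _ in range(n_ranks)]
    for read in reads:
        for km in kmers(read, k):
            # On a real machine this update would be a message to a remote rank;
            # here it is a local dictionary update (assumption for illustration).
            partitions[owner(km, n_ranks)][km] += 1
    return partitions


# Three overlapping toy reads drawn from the same underlying sequence.
reads = ["ACGTAC", "CGTACG", "GTACGT"]
parts = count_partitioned(reads, k=4, n_ranks=4)

# Merge the partitions to inspect the global counts.
total = sum(parts, Counter())
```

Because ownership depends only on the hash of the k-mer, every occurrence of the same k-mer is routed to the same partition regardless of which read it came from, so overlapping reads accumulate evidence in one place. In a genuinely distributed setting each such update becomes communication, which is one way to picture the communication costs the abstract mentions.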

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.11147/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1705.11147/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/1705.11147/full.md

---
Source: https://tomesphere.com/paper/1705.11147