
TL;DR
This paper presents the Hierarchical Overlap Graph (HOG), a space-efficient alternative to the traditional Overlap Graph (OG) for assembling DNA fragments, together with a linear-space construction algorithm for words of equal length.
Contribution
It presents the first linear-space algorithm for constructing the Hierarchical Overlap Graph from equal-length words, improving scalability.
Findings
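The HOG encodes all maximal overlaps in space that is linear in the sum of the lengths of the input words.
The proposed algorithm builds the HOG in linear space when all input words have equal length.
Like the OG, the HOG can support DNA fragment assembly and shortest superstring computation.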
Abstract
Given a set of finite words, the Overlap Graph (OG) is a complete weighted digraph where each word is a node and where the weight of an arc equals the length of the longest overlap of one word onto the other (overlap is an asymmetric notion). The OG serves to assemble DNA fragments or to compute shortest superstrings, which are a compressed representation of the input. The OG requires space that is quadratic in the number of words, which limits its scalability. The Hierarchical Overlap Graph (HOG) is an alternative graph that also encodes all maximal overlaps, but uses space that is linear in the sum of the lengths of the input words. We propose the first algorithm to build the HOG in linear space for words of equal length.
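To make the OG definition concrete, here is a minimal Python sketch of a naive Overlap Graph over a small set of distinct words. It is not the paper's algorithm and does not build the HOG; the names `longest_overlap` and `overlap_graph` are illustrative assumptions. It only shows why overlap is asymmetric and why storing one arc per ordered pair of words grows quadratically with the number of words.

```python
from itertools import permutations


def longest_overlap(u: str, v: str) -> int:
    """Length of the longest suffix of u that is also a prefix of v.

    Assumption for this sketch: u and v are distinct words.
    """
    for k in range(min(len(u), len(v)), 0, -1):
        if u.endswith(v[:k]):
            return k
    return 0


def overlap_graph(words: list[str]) -> dict[tuple[str, str], int]:
    """Naive OG: one weighted arc for every ordered pair of distinct words."""
    return {(u, v): longest_overlap(u, v) for u, v in permutations(words, 2)}


words = ["abc", "bcd", "cda"]
graph = overlap_graph(words)

# Overlap is asymmetric: "abc" overlaps onto "bcd" by "bc", but not the reverse.
print(graph[("abc", "bcd")], graph[("bcd", "abc")])

# The number of arcs is n * (n - 1), i.e. quadratic in the number of words.
print(len(graph))
```

The quadratic arc count in this sketch is the scalability limit that the HOG is meant to avoid, since its size depends on the total length of the input words rather than on the number of word pairs.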
