Alternative Algorithms for Lyndon Factorization

Sukhpal Singh Ghuman; Emanuele Giaquinta; Jorma Tarhio

arXiv:1405.4892·cs.DS·July 14, 2014·1 cites

Open Access

TL;DR

This paper introduces two variations of Duval's algorithm for Lyndon factorization: one for small alphabets that skips over runs of the smallest character, and one that works directly on run-length encoded strings in time linear in the length of the encoding.

Contribution

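It presents two variations of Duval's algorithm for Lyndon factorization: the first speeds up factorization over small alphabets when the string contains runs of the smallest character, and the second factorizes a run-length encoded string in time linear in the length of the encoding, using constant space.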

Findings

01

The small-alphabet algorithm is more than ten times faster than Duval's original algorithm on long DNA strings.

02

The run-length encoded algorithm computes the Lyndon factorization of a run-length encoded string of length $\rho$ in $O(\rho)$ time and constant space.

03

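The speedup of the small-alphabet algorithm comes from skipping a significant portion of the characters when the string contains runs of the smallest character in the alphabet.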

Abstract

We present two variations of Duval's algorithm for computing the Lyndon factorization of a word. The first algorithm is designed for the case of small alphabets and is able to skip a significant portion of the characters of the string, for strings containing runs of the smallest character in the alphabet. Experimental results show that it is faster than Duval's original algorithm, more than ten times in the case of long DNA strings. The second algorithm computes, given a run-length encoded string $R$ of length $\rho$, the Lyndon factorization of $R$ in $O(\rho)$ time and constant space.
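
For orientation, the baseline that both variations build on can be sketched as follows. This is a minimal Python rendering of the standard textbook form of Duval's algorithm, not either of the paper's two variants. The function names `duval` and `run_length_encode` are illustrative, and representing a run-length encoding as a list of (character, run length) pairs is an assumption made here only to show what the length $\rho$ counts.

```python
from itertools import groupby


def duval(s):
    """Return the Lyndon factorization of s as a list of factors.

    Standard Duval's algorithm: linear time in len(s), constant extra
    space apart from the output list.
    """
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Scan while s[i:j] is still a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i       # s[i:j+1] is itself a Lyndon word: restart comparison
            else:
                k += 1      # still matching the current period
            j += 1
        # The period has length j - k; emit every full copy of it.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def run_length_encode(s):
    """Hypothetical helper: encode s as a list of (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


if __name__ == "__main__":
    print(duval("banana"))                 # ['b', 'an', 'an', 'a']
    print(duval("GATTACA"))                # ['G', 'ATT', 'AC', 'A']
    print(run_length_encode("aaabccdd"))   # [('a', 3), ('b', 1), ('c', 2), ('d', 2)]
```

In this sketch every character of the input is compared at least once. Under the assumed encoding, `"aaabccdd"` has 8 characters but only 4 runs, which is the quantity the $O(\rho)$ bound refers to.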

Taxonomy

Topics: Algorithms and Data Compression · DNA and Biological Computing · Semigroups and Automata Theory