Approximating LCS and Alignment Distance over Multiple Sequences

Debarati Das; Barna Saha

arXiv:2110.12402·cs.DS·October 26, 2021

TL;DR

This paper develops approximation algorithms for multiple sequence alignment, covering both the longest common subsequence (LCS) and the alignment distance (AD), that run faster than the roughly $n^{m}$ time exact computation needs for $m$ sequences of length $n$.

Contribution

It introduces new approximation algorithms for LCS and AD of multiple sequences, improving runtime and approximation factors under certain conditions.

Findings

01

Approximate LCS within a factor of $\frac{\lambda^2 n}{2+\epsilon}$ in $\tilde{O}_m(n^{\lfloor \frac{m}{2} \rfloor + 1})$ time.

02

Approximate AD within a factor of 2 in $\tilde{O}_m(n^{\lceil \frac{m}{2} \rceil})$ time.

03

Below-2 approximation for AD achieved under specific pseudorandomness conditions.

Abstract

We study the problem of aligning multiple sequences with the goal of finding an alignment that either maximizes the number of aligned symbols (the longest common subsequence (LCS)), or minimizes the number of unaligned symbols (the alignment distance (AD)). Multiple sequence alignment is a well-studied problem in bioinformatics and is used to identify regions of similarity among DNA, RNA, or protein sequences to detect functional, structural, or evolutionary relationships among them. It is known that exact computation of LCS or AD of $m$ sequences each of length $n$ requires $\Theta(n^{m})$ time unless the Strong Exponential Time Hypothesis is false. In this paper, we provide several results to approximate LCS and AD of multiple sequences. If the LCS of $m$ sequences each of length $n$ is $\lambda n$ for some $\lambda \in [0, 1]$, then in $\tilde{O}_{m}(n^{\lfloor \frac{m}{2} \rfloor + 1})$ …
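To make the two objectives concrete, here is a minimal sketch of the exact baseline that the abstract says costs about $n^{m}$ time: a dynamic program over tuples of prefix lengths. It is not the paper's approximation algorithm. The function names `multi_lcs` and `unaligned_symbols` are hypothetical, and counting unaligned symbols as the total length minus $m$ times the LCS length is an assumed reading of "number of unaligned symbols"; the paper's formal definition of AD may be normalized differently.

```python
from functools import lru_cache
from typing import Sequence


def multi_lcs(seqs: Sequence[str]) -> int:
    """Exact LCS length of all strings in seqs (illustrative baseline only).

    The DP state is one prefix length per sequence, so there are up to
    n^m states for m sequences of length n. The recursion is only
    practical for short inputs.
    """
    m = len(seqs)
    if m == 0:
        return 0

    @lru_cache(maxsize=None)
    def solve(idx: tuple) -> int:
        # idx[k] is the length of the prefix of seqs[k] under consideration.
        if any(i == 0 for i in idx):
            return 0
        last = {seqs[k][idx[k] - 1] for k in range(m)}
        if len(last) == 1:
            # All prefixes end in the same symbol: align it across all sequences.
            return 1 + solve(tuple(i - 1 for i in idx))
        # Otherwise some sequence's last symbol is unaligned: try dropping each.
        return max(
            solve(idx[:k] + (idx[k] - 1,) + idx[k + 1:]) for k in range(m)
        )

    return solve(tuple(len(s) for s in seqs))


def unaligned_symbols(seqs: Sequence[str]) -> int:
    """Symbols left out of an optimal alignment, summed over all sequences.

    Assumption: every aligned column uses one symbol from each sequence,
    so the count is total length minus m * LCS.
    """
    return sum(len(s) for s in seqs) - len(seqs) * multi_lcs(seqs)


if __name__ == "__main__":
    example = ["ACGT", "AGT", "ACT"]
    print(multi_lcs(example), unaligned_symbols(example))
```

For the three strings in the example, the only symbols common to all of them in order are `A` and `T`, so the LCS has length 2 and, under the assumed counting, 4 symbols stay unaligned.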
