Assessing the best edit in perturbation-based iterative refinement algorithms to compute the median string

P. Mirabal; J. Abreu; D. Seco

arXiv:1912.02217·cs.DS·December 6, 2019

TL;DR

This paper introduces a new algorithm for finding median strings in biological sequence data that improves convergence speed over existing heuristics by better estimating the impact of perturbations.

Contribution

The paper presents a novel perturbation-based iterative refinement algorithm that outperforms state-of-the-art heuristics for the median string in terms of convergence speed.

Findings

01

The new algorithm converges faster than state-of-the-art methods.

02

Experimental results validate the improved convergence speed.

03

The speed-up is achieved without degrading the quality of the approximated median string.

Abstract

Strings are a natural representation of biological data such as DNA, RNA and protein sequences. The problem of finding a string that summarizes a set of sequences has direct application in relative compression algorithms for genome and proteome analysis, where reference sequences need to be chosen. Median strings have been used as representatives of a set of strings in different domains. However, several formulations of those problems are NP-complete. Alternatively, heuristic approaches that iteratively refine an initial coarse solution by applying edit operations have been proposed. Recently, we investigated the selection of the optimal edit operations to speed up convergence without spoiling the quality of the approximated median string. We propose a novel algorithm that outperforms state-of-the-art heuristic approximations to the median string in terms of convergence speed by…
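To make the idea of iterative refinement by edit operations concrete, here is a minimal Python sketch of a naive baseline. It is not the algorithm proposed in the paper: it assumes unit-cost Levenshtein distance, starts from the set median (the input string with the smallest total distance to the others), and exhaustively tries every single-character deletion, substitution, and insertion at each step. The function names (`levenshtein`, `total_distance`, `single_edit_neighbors`, `refine_median`) and the greedy acceptance rule are illustrative assumptions. The paper's contribution concerns choosing which edit to try so that far fewer candidates need to be evaluated; this sketch makes no attempt at that.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def total_distance(candidate: str, strings: List[str]) -> int:
    """Objective minimized by the median string: sum of distances to the set."""
    return sum(levenshtein(candidate, s) for s in strings)


def single_edit_neighbors(s: str, alphabet: str) -> Iterable[str]:
    """Yield every string reachable from s by exactly one edit operation."""
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]                 # deletion at position i
        for c in alphabet:
            if c != s[i]:
                yield s[:i] + c + s[i + 1:]     # substitution at position i
    for i in range(len(s) + 1):
        for c in alphabet:
            yield s[:i] + c + s[i:]             # insertion before position i


def refine_median(strings: List[str], alphabet: str) -> str:
    """Greedy refinement: apply the best single edit until none improves."""
    # Initial coarse solution (assumption): the set median.
    current = min(strings, key=lambda s: total_distance(s, strings))
    best = total_distance(current, strings)
    while True:
        candidate = min(
            single_edit_neighbors(current, alphabet),
            key=lambda s: total_distance(s, strings),
            default=current,
        )
        cost = total_distance(candidate, strings)
        if cost >= best:
            return current  # local optimum: no single edit lowers the cost
        current, best = candidate, cost
```

For example, `refine_median(["AAB", "ABA", "BAA"], "AB")` starts from a set median with total distance 4 and returns a string with total distance 3. Each iteration of this baseline evaluates every neighbor against every input string, which is exactly the cost that a better estimate of the most promising edit is meant to reduce.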
