Approximating Weighted Duo-Preservation in Comparative Genomics
Saeed Mehrabi

TL;DR
This paper introduces a weighted generalization of the duo-preservation problem in comparative genomics, in which a preserved duo's contribution can depend on how close its positions are in the two strings, and gives a polynomial-time 6-approximation algorithm for it.
Contribution
It formulates MWDSM, a weighted generalization of the MDSM problem that incorporates duo weights and proximity, and presents the first polynomial-time approximation algorithm for it.
Findings
Developed a polynomial-time 6-approximation algorithm for MWDSM
Extended duo-preservation concepts to include proximity weights
Bridged the gap between unweighted and weighted duo-preservation in genomics
Abstract
Motivated by comparative genomics, Chen et al. [9] introduced the Maximum Duo-preservation String Mapping (MDSM) problem, in which we are given two strings $s_1$ and $s_2$ from the same alphabet and the goal is to find a mapping between them so as to maximize the number of duos preserved. A duo is any two consecutive characters in a string, and it is preserved in the mapping if its two consecutive characters in $s_1$ are mapped to the same two consecutive characters in $s_2$. The MDSM problem is known to be NP-hard and there are approximation algorithms for this problem [3, 5, 13], but all of them consider only the "unweighted" version of the problem, in the sense that a duo from $s_1$ is preserved by mapping it to any identical duo in $s_2$ regardless of their positions in the respective strings. However, it is desirable in comparative genomics to find mappings that consider preserving duos…
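To make the definition of a preserved duo concrete, the following minimal sketch evaluates a given mapping; it does not implement the paper's approximation algorithm. It assumes the two strings have equal length and that the mapping is a character-preserving bijection on positions, encoded as a list `pi` where `pi[i]` is the position in the second string assigned to position `i` of the first. The function names and the `proximity_weight` function are hypothetical illustrations, not the weight function used in the paper.

```python
def is_valid_mapping(s1, s2, pi):
    """Check that pi is a bijection on positions that maps each character to an equal one."""
    n = len(s1)
    return (
        len(s2) == n
        and sorted(pi) == list(range(n))
        and all(s1[i] == s2[pi[i]] for i in range(n))
    )


def preserved_duos(s1, s2, pi):
    """Return start positions i in s1 whose duo (i, i+1) lands on consecutive positions in s2."""
    if not is_valid_mapping(s1, s2, pi):
        raise ValueError("pi must be a character-preserving bijection between s1 and s2")
    return [i for i in range(len(s1) - 1) if pi[i + 1] == pi[i] + 1]


def total_weight(s1, s2, pi, weight):
    """Sum weight(i, j) over preserved duos, where j = pi[i] is the duo's start position in s2."""
    return sum(weight(i, pi[i]) for i in preserved_duos(s1, s2, pi))


def proximity_weight(i, j):
    """Hypothetical weight (an assumption): duos at nearby positions count for more."""
    return 1.0 / (1 + abs(i - j))


s1, s2 = "abcab", "ababc"
pi = [2, 3, 4, 0, 1]  # "abc" maps to s2[2:5], the trailing "ab" maps to s2[0:2]

print(preserved_duos(s1, s2, pi))                    # [0, 1, 3]
print(total_weight(s1, s2, pi, lambda i, j: 1))      # unweighted count: 3
print(total_weight(s1, s2, pi, proximity_weight))    # position-sensitive score
```

With the constant weight 1 the score is the number of preserved duos, which is the unweighted objective; a position-dependent weight rewards mappings that keep duos near their original positions.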
