TL;DR
This paper introduces PQ-Tree Search, a new computational problem in genomics, and presents a parameterized algorithm and tool, PQFinder, to identify approximate gene cluster instances in genomes, demonstrating its effectiveness on prokaryotic genomes.
Contribution
The paper defines the NP-hard PQ-Tree Search problem, develops a parameterized algorithm, and implements a tool for finding gene clusters in genomes, advancing comparative genomics methods.
Findings
Identified 29 gene clusters rearranged in plasmids.
Demonstrated the tool's ability to find structural variants of gene clusters.
Provided publicly available code and data for reproducibility.
Abstract
We define a new problem in comparative genomics, denoted PQ-Tree Search, that takes as input a PQ-tree $T$ representing the known gene orders of a gene cluster of interest, a gene-to-gene substitution scoring function $h$, integer parameters $d_T$ and $d_S$, and a new genome $S$. The objective is to identify in $S$ approximate new instances of the gene cluster that could vary from the known gene orders by genome rearrangements that are constrained by $T$, by gene substitutions that are governed by $h$, and by gene deletions and insertions that are bounded from above by $d_T$ and $d_S$, respectively. We prove that the PQ-Tree Search problem is NP-hard and propose a parameterized algorithm that solves the optimization variant of PQ-Tree Search in $O^*(2^{\gamma})$ time, where $\gamma$ is the maximum degree of a node in $T$ and $O^*$ is used to hide factors polynomial in the input size.…
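To make "a PQ-tree representing the known gene orders" concrete, the following is a minimal sketch of how a PQ-tree encodes a set of leaf orders: the children of a P-node may appear in any permutation, while the children of a Q-node may appear only in their given order or its reverse. This is a brute-force illustration of the data structure, not the paper's parameterized algorithm and not PQFinder's code. The tuple encoding of nodes and the function name `frontiers` are assumptions made for this example, and the enumeration is exponential in the node degree, so it is only practical for tiny trees.

```python
from itertools import permutations, product

def frontiers(node):
    """Return the set of leaf orders (as tuples) that a PQ-tree node can produce.

    Hypothetical encoding used only in this sketch:
      - a leaf is a string (a gene name)
      - an internal node is a pair (kind, children) with kind "P" or "Q"
    """
    if isinstance(node, str):
        return {(node,)}
    kind, children = node
    child_sets = [frontiers(c) for c in children]
    idx = tuple(range(len(children)))
    if kind == "P":
        # P-node: children may be arranged in any order.
        arrangements = permutations(idx)
    elif kind == "Q":
        # Q-node: children keep their order, or are reversed as a whole.
        arrangements = [idx, idx[::-1]]
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    result = set()
    for order in arrangements:
        # Pick one frontier per child and concatenate them in this arrangement.
        for combo in product(*(child_sets[i] for i in order)):
            result.add(sum(combo, ()))
    return result

# A toy cluster: gene a, then genes b and c in either order, then gene d,
# with the whole block allowed to appear reversed.
tree = ("Q", ["a", ("P", ["b", "c"]), "d"])
orders = sorted("".join(o) for o in frontiers(tree))
print(orders)  # ['abcd', 'acbd', 'dbca', 'dcba']
```

In this toy tree, four gene orders are admissible. A search as described in the abstract would additionally allow substitutions scored by the scoring function and a bounded number of deletions and insertions, none of which this sketch models.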
