Complexity of modification problems for best match graphs
David Schaller, Peter F. Stadler, Marc Hellmuth

TL;DR
This paper investigates the computational complexity of modifying empirically estimated graphs so that they become valid best match graphs (BMGs). It shows that the key arc modification problems are NP-complete and provides ILP formulations based on new characterizations of BMGs.
Contribution
The paper introduces novel characterizations of BMGs in terms of triples and forbidden subgraphs, and proves that the arc deletion, arc completion, and arc editing problems for BMGs are NP-complete.
Findings
The arc deletion, arc completion, and arc editing problems for BMGs are NP-complete.
BMGs can be characterized by triples (binary trees on three leaves) and forbidden subgraphs.
All three problems can be formulated and solved as integer linear programs.
Abstract
Best match graphs (BMGs) are vertex-colored directed graphs that were introduced to model the relationships of genes (vertices) from different species (colors) given an underlying evolutionary tree that is assumed to be unknown. In real-life applications, BMGs are estimated from sequence similarity data. Measurement noise and approximation errors usually result in empirically determined graphs that in general violate characteristic properties of BMGs. The arc modification problems for BMGs aim at correcting such violations and thus provide a means to improve the initial estimates of best match data. We show here that the arc deletion, arc completion and arc editing problems for BMGs are NP-complete and that they can be formulated and solved as integer linear programs. To this end, we provide a novel characterization of BMGs in terms of triples (binary trees on three leaves) and a…
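To make the objects in the abstract concrete, the following minimal Python sketch builds the BMG of a small, known tree and measures how far an observed graph is from it. It assumes the standard definition of a best match: a gene y is a best match of x if no other gene of y's color has a last common ancestor with x that is closer to x. The function name `best_match_graph`, the `parent` and `color` dictionaries, and the toy tree are all hypothetical and chosen for illustration only. The sketch is not the paper's ILP formulation, and in the actual modification problems the tree is unknown, so the target is the closest BMG over all trees rather than one fixed candidate.

```python
def best_match_graph(parent, color):
    """Return the arc set of the BMG explained by a rooted leaf-colored tree.

    parent: dict mapping every vertex to its parent (the root maps to None)
    color:  dict mapping every leaf (gene) to its color (species)
    """
    def ancestors(v):
        # Path from v up to the root, with v itself included.
        path = []
        while v is not None:
            path.append(v)
            v = parent[v]
        return path

    def lca_steps(x, y):
        # Number of steps from x up to lca(x, y); fewer steps means a
        # closer relative of x.
        anc_y = set(ancestors(y))
        for steps, a in enumerate(ancestors(x)):
            if a in anc_y:
                return steps

    arcs = set()
    leaves = list(color)
    for x in leaves:
        for c in set(color.values()) - {color[x]}:
            candidates = [y for y in leaves if color[y] == c]
            best = min(lca_steps(x, y) for y in candidates)
            # Every candidate that attains the closest lca is a best match.
            arcs |= {(x, y) for y in candidates if lca_steps(x, y) == best}
    return arcs


# Hypothetical toy tree: root r has children u and b2; u has children a1, b1.
parent = {"r": None, "u": "r", "b2": "r", "a1": "u", "b1": "u"}
color = {"a1": "A", "b1": "B", "b2": "B"}

bmg = best_match_graph(parent, color)

# Hypothetical noisy estimate that misses the arc (b2, a1). Here b2 has no
# out-neighbor of color A, which cannot happen in a BMG.
observed = {("a1", "b1"), ("b1", "a1")}

# Arc modifications are counted by the symmetric difference of the arc sets.
cost_to_candidate = len(observed ^ bmg)
```

In this toy case a single arc insertion turns the observed graph into the candidate BMG. The hardness results concern finding a minimum-cost set of such deletions, insertions, or both when no tree is given.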
