Complete Characterization of Incorrect Orthology Assignments in Best Match Graphs

David Schaller; Manuela Geiß; Peter F. Stadler; Marc Hellmuth

arXiv:2006.02249 · q-bio.PE · November 30, 2020

TL;DR

This paper characterizes unambiguous false-positive orthology assignments in best match graphs and provides a polynomial-time algorithm that identifies all of them from graph structure alone. In simulations, these account for at least 75% of all incorrect orthology assignments.

Contribution

It introduces a method to detect unambiguous false-positive orthology edges in best match graphs without relying on gene or species trees.

Findings

01

In simulations, at least 75% of all incorrect orthology assignments can be detected.

02

Provides a polynomial-time algorithm that identifies all unambiguous false-positive (u-fp) orthology assignments in a BMG.

03

Results depend only on the structure of best match graphs.

Abstract

Genome-scale orthology assignments are usually based on reciprocal best matches. In the absence of horizontal gene transfer (HGT), every pair of orthologs forms a reciprocal best match. Incorrect orthology assignments therefore are always false positives in the reciprocal best match graph. We consider duplication/loss scenarios and characterize unambiguous false-positive (u-fp) orthology assignments, that is, edges in the best match graphs (BMGs) that cannot correspond to orthologs for any gene tree that explains the BMG. Moreover, we provide a polynomial-time algorithm to identify all u-fp orthology assignments in a BMG. Simulations show that at least $75\%$ of all incorrect orthology assignments can be detected in this manner. All results rely only on the structure of the BMGs and not on any a priori knowledge about underlying gene or species trees.
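
To make the notion of a reciprocal best match concrete, here is a minimal Python sketch. It assumes the BMG is given as a dictionary mapping each gene to the set of its best matches (a directed edge x -> y means y is a best match of x) and returns the undirected edges of the reciprocal best match graph. The function name `reciprocal_best_matches` and the toy data are hypothetical and chosen only for illustration. This is not the paper's u-fp detection algorithm; it only extracts the candidate orthology edges that such an algorithm would examine.

```python
def reciprocal_best_matches(best_matches):
    """Return the undirected edges {x, y} for which both x -> y and y -> x
    are present in the directed best match graph.

    best_matches: dict mapping a gene to the set of its best matches.
    (Hypothetical input format, assumed for this sketch.)
    """
    edges = set()
    for x, targets in best_matches.items():
        for y in targets:
            # Keep the pair only if the match is reciprocated.
            if x in best_matches.get(y, set()):
                edges.add(frozenset((x, y)))
    return edges


# Hypothetical toy BMG: genes a1, a2 from species A, b1 from B, c1 from C.
toy_bmg = {
    "a1": {"b1", "c1"},
    "a2": {"b1", "c1"},
    "b1": {"a1", "c1"},
    "c1": {"a1", "b1"},
}

rbmg = reciprocal_best_matches(toy_bmg)
for edge in sorted(tuple(sorted(e)) for e in rbmg):
    print(edge)
```

In this toy example, a2 has b1 and c1 as best matches, but neither reciprocates, so a2 is isolated in the reciprocal best match graph. Every edge that does survive is only a candidate orthology assignment and may still be a false positive.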
