UITOTO: a software for generating molecular diagnoses for species descriptions
Ambrosio Torres, Leshon Lee, Amrita Srivathsan, Rudolf Meier

TL;DR
UITOTO is new software that generates molecular diagnoses for species descriptions by finding diagnostic genetic markers and validating them against available sequences.
Contribution
UITOTO introduces a method for generating and validating molecular diagnoses that combines weighted random sampling based on the Jaccard Index with a majority-consensus strategy.
Findings
On large datasets, UITOTO outperforms existing tools such as MOLD in classification accuracy, as measured by the F1 score.
The software identifies optimal diagnostic molecular combinations (DMCs) that balance specificity and length effectively.
A user-friendly Shiny App-GUI is provided for visualization and generating publication-quality DMCs.
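As a rough illustration of how an F1 score can summarize the classification accuracy of a DMC, the sketch below treats a DMC as a mapping from alignment positions to character states and scores it against sequences of the target species and of other species. The function names (`f1_score`, `matches`, `score_dmc`), the 0-based positions, and the toy sequences are assumptions made for this example; they are not UITOTO's actual interface or data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matches(seq: str, dmc: dict) -> bool:
    """True if the aligned sequence carries every (position, state) pair of the DMC."""
    return all(pos < len(seq) and seq[pos] == state for pos, state in dmc.items())


def score_dmc(dmc: dict, target: list, others: list) -> float:
    """Score a DMC: target sequences should match it, all other sequences should not."""
    tp = sum(matches(s, dmc) for s in target)   # target sequences correctly diagnosed
    fn = len(target) - tp                       # target sequences the DMC misses
    fp = sum(matches(s, dmc) for s in others)   # non-target sequences wrongly matched
    return f1_score(tp, fp, fn)


# Hypothetical toy alignment (positions are 0-based).
target = ["ACGTA", "ACGTA", "ACGAA"]
others = ["TCGTA", "ACCTA"]

print(score_dmc({0: "A", 2: "G"}, target, others))  # two-site DMC
print(score_dmc({0: "A"}, target, others))          # one-site DMC, less specific
```

In this toy case the two-site DMC separates the target from both other sequences, while the one-site DMC also matches one non-target sequence and therefore scores lower.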
Abstract
Millions of species remain undescribed, and each eventually will require a species description with a diagnosis. Yet, we lack software that can derive state‐specific and contrastive molecular diagnoses and allows the user to validate them based on all available sequences for the taxon under study. Here we introduce UITOTO, which addresses this shortcoming by facilitating the identification, testing, and visualization of diagnostic molecular combinations (DMCs). The software uses a weighted random sampling algorithm based on the Jaccard Index for building candidate DMCs. It then selects DMCs with the highest specificity stability, meeting user‐defined thresholds for exclusive character states. If multiple optimal DMCs are identified, UITOTO derives a majority‐consensus DMC. To verify whether the generated DMCs are contrastive, UITOTO includes a validation module that tests DMCs against…
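The steps named in the abstract (Jaccard-weighted random sampling of candidate DMCs, followed by a majority consensus) can be sketched as follows. This is a minimal sketch under stated assumptions, not UITOTO's implementation: the abstract does not say how the Jaccard Index is turned into sampling weights, so here each site that is invariant within the target species is weighted by the Jaccard index between the set of target sequences and the set of all sequences carrying the target's state at that site. The specificity-stability selection and user-defined thresholds are not modeled, and all function names are hypothetical.

```python
import random
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def site_weights(target: list, others: list) -> dict:
    """Assumed weighting: sites whose target state is rare outside the target weigh more."""
    all_seqs = target + others
    target_ids = set(range(len(target)))
    weights = {}
    for pos in range(len(target[0])):
        states = {s[pos] for s in target}
        if len(states) != 1:
            continue  # skip sites that vary within the target species
        state = next(iter(states))
        carriers = {i for i, s in enumerate(all_seqs) if s[pos] == state}
        weights[pos] = jaccard(target_ids, carriers)
    return weights


def sample_dmc(target: list, others: list, rng: random.Random, max_len: int = 10):
    """Draw sites by weighted random sampling until no non-target sequence matches."""
    pool = site_weights(target, others)
    dmc, remaining = {}, list(others)
    while remaining and pool and len(dmc) < max_len:
        sites = list(pool)
        pos = rng.choices(sites, weights=[pool[p] for p in sites], k=1)[0]
        del pool[pos]  # sample without replacement
        state = target[0][pos]
        kept = [s for s in remaining if s[pos] == state]
        if len(kept) < len(remaining):  # keep only sites that exclude something
            dmc[pos] = state
            remaining = kept
    return dmc if not remaining else None  # None: no fully exclusive DMC found


def majority_consensus(dmcs: list) -> dict:
    """Keep (position, state) pairs present in more than half of the DMCs."""
    counts = Counter(pair for d in dmcs for pair in d.items())
    return {pos: state for (pos, state), n in counts.items() if n > len(dmcs) / 2}


# Hypothetical toy alignment (positions are 0-based).
target = ["ACGTA", "ACGTA"]
others = ["TCGTA", "ACCTA"]
rng = random.Random(0)
candidates = [sample_dmc(target, others, rng) for _ in range(5)]
print(majority_consensus([d for d in candidates if d is not None]))
```

Sampling without replacement and discarding sites that exclude no remaining sequence keeps the candidate DMCs short; a real tool would additionally have to test candidates against all available sequences, as the abstract's validation module is described as doing.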
Taxonomy
Topics: Environmental DNA in Biodiversity Studies · Diptera species taxonomy and behavior · Forensic Entomology and Diptera Studies
