# Comprehensive evaluation of ACMG/AMP-based variant classification tools

**Authors:** Tohid Ghasemnejad, Yuheng Liang, Khadijeh Hoda Jahanian, Milad Eidi, Arash Salmaninejad, Seyedeh Sedigheh Abedini, Fabrizzio Horta, Nigel H Lovell, Thantrira Porntaveetus, Mark Grosser, Mahmoud Aarabi, Hamid Alinejad-Rokny

PMC · DOI: 10.1093/bioinformatics/btaf623 · Bioinformatics · 2026-02-13

## TL;DR

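This study benchmarks four ACMG/AMP-based variant classification tools against the phenotype-driven tool LIRICAL on 151 expert-curated Mendelian disorder datasets, finding that tools with advanced phenotypic integration achieve higher variant prioritization accuracy.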

## Contribution

The study provides the first comprehensive benchmark of ACMG/AMP-based variant classification tools using expert-curated datasets.

## Key findings

- LIRICAL (68.21%) and Franklin (61.59%) showed the highest top-10 variant prioritization accuracy in Mendelian disorders.
- Tools with advanced phenotypic integration significantly outperformed those relying primarily on genomic features.
- Statistical significance was assessed with bootstrap confidence intervals (n = 1000) and Friedman tests.
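
As a rough illustration of the Friedman test mentioned above, the sketch below computes the Friedman statistic for a table in which each row is a case and each column is a tool. It is not the authors' code: the function names and the toy table are hypothetical, the table is assumed to hold the rank each tool assigned to the target variant, and the statistic is the textbook form without tie correction. Under the null hypothesis it is approximately chi-square distributed with k - 1 degrees of freedom, where k is the number of tools.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square statistic (no tie correction).

    table: list of rows, one per case; each row holds one score per tool.
    """
    n = len(table)       # number of cases (blocks)
    k = len(table[0])    # number of tools (treatments)
    rank_sums = [0.0] * k
    for row in table:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical toy data: 4 cases, 3 tools; tool 0 always ranks the variant best.
consistent = [[1, 5, 9], [2, 6, 30], [1, 3, 4], [3, 8, 12]]
# Hypothetical toy data where every tool gives the same rank in every case.
all_tied = [[4, 4, 4], [2, 2, 2], [7, 7, 7], [1, 1, 1]]

q_consistent = friedman_statistic(consistent)
q_tied = friedman_statistic(all_tied)
print(q_consistent, q_tied)
```

With perfectly consistent ordering the statistic reaches its maximum of n(k - 1); when the tools are indistinguishable it is zero.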

## Abstract

The American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines represent the gold standard for clinical variant interpretation. Despite the widespread adoption of ACMG/AMP guidelines, a comprehensive comparison of the software tools designed to implement them has been lacking. This represents a significant gap, as clinicians require evidence-based guidance on which tools to use in their practice.

We benchmarked four ACMG/AMP-based tools (Franklin, InterVar, TAPES, GeneBe) selected from 22 tools, and compared their performance with LIRICAL, a top-performing phenotype-driven tool, using 151 expert-curated datasets from Mendelian disorders. Selection criteria included free availability, VCF compatibility, operational reliability, and not being disease-specific. Our evaluation framework assessed top-N accuracy (N = 1, 5, 10, 20, 50), retention rates, precision, recall, F1 scores, and area under the curve (AUC). Statistical validation employed bootstrap confidence intervals (n = 1000) and Friedman tests. LIRICAL (68.21%) and Franklin (61.59%) demonstrated superior top-10 variant prioritization accuracy, significantly outperforming the other tools (P < .0001). These results demonstrate that tools with advanced phenotypic integration significantly outperform those relying primarily on genomic features.
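
To make the top-N accuracy and bootstrap steps concrete, here is a minimal sketch, assuming each case yields the 1-based rank a tool assigned to the target variant (or `None` if the tool filtered it out). The function names and the rank list are hypothetical and are not taken from the study's Code Ocean capsule. The toy data are chosen so that 103 of 151 cases fall in the top 10, which is consistent with the reported 68.21% (103/151 ≈ 0.6821) but is only an arithmetic illustration.

```python
import random


def top_n_accuracy(ranks, n):
    """Fraction of cases whose target variant is ranked within the top n.

    ranks: one entry per case, the 1-based rank of the target variant,
    or None when the tool did not retain the variant at all.
    """
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks)


def bootstrap_ci(ranks, n, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for top-n accuracy.

    Cases are resampled with replacement n_boot times; the interval is
    read off the sorted resampled accuracies.
    """
    rng = random.Random(seed)
    stats = sorted(
        top_n_accuracy(rng.choices(ranks, k=len(ranks)), n)
        for _ in range(n_boot)
    )
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical ranks for 151 cases: 60 at rank 1, 43 at rank 7,
# 30 at rank 25, and 18 where the variant was filtered out.
ranks = [1] * 60 + [7] * 43 + [25] * 30 + [None] * 18

acc_top10 = top_n_accuracy(ranks, 10)
ci_low, ci_high = bootstrap_ci(ranks, 10)
print(f"top-10 accuracy: {acc_top10:.2%}")
print(f"95% bootstrap CI: [{ci_low:.2%}, {ci_high:.2%}]")
```

Resampling whole cases, rather than individual variants, keeps each patient dataset as the unit of analysis, which matches how top-N accuracy is defined here.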

All data and source code required to reproduce the findings of this study are openly available in the Code Ocean repository at https://doi.org/10.24433/CO.6562438.v1.

## Full-text entities

- **Diseases:** Mendelian disorders (MESH:D025861)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12916173/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12916173/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12916173/full.md

---
Source: https://tomesphere.com/paper/PMC12916173