# Efficient identification of de novo mutations in family trios: a consensus-based informatic approach

**Authors:** Mariya Shadrina, Özem Kalay, Sinem Demirkaya-Budak, Charles A LeDuc, Wendy K Chung, Deniz Turgut, Gungor Budak, Elif Arslan, Vladimir Semenyuk, Brandi Davis-Dusenbery, Christine E Seidman, H Joseph Yost, Amit Jain, Bruce D Gelb

PMC · DOI: 10.26508/lsa.202403039 · Life Science Alliance · 2025-03-28

## TL;DR

This paper introduces an automated, consensus-based workflow for identifying de novo variants in family trios from short-read genome sequencing data. Requiring agreement among three variant-calling pipelines yields very high precision without manual inspection.

## Contribution

A novel consensus-based informatic workflow for high-precision de novo variant detection in genome sequencing trios.

## Key findings

- Consensus filtering followed by force-calling achieved 98.0–99.4% precision in identifying de novo variants.
- Sensitivity reached 99.4% against a reference set of de novo variants previously validated by manual inspection and Sanger sequencing.
- Validation on the HG002-3-4 Genome-in-a-Bottle trio showed precision reaching 99.2% and sensitivity up to 96.6%.
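
The precision and sensitivity figures above follow the usual definitions: precision is the fraction of called variants that are in the reference set, and sensitivity is the fraction of reference variants that were called. The sketch below illustrates the arithmetic only; the function name `precision_sensitivity` and the toy variant keys are assumptions for illustration and do not come from the paper.

```python
def precision_sensitivity(called: set, truth: set) -> tuple[float, float]:
    """Compute precision and sensitivity of a call set against a truth set.

    Both arguments are sets of hashable variant keys (hypothetical format,
    e.g. "chr1:12345:A>G").
    """
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if called else float("nan")
    sensitivity = tp / (tp + fn) if truth else float("nan")
    return precision, sensitivity


# Toy data, not taken from the paper.
called = {"chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A", "chr4:400:T>C"}
truth = {"chr1:100:A>G", "chr2:200:C>T", "chr3:300:G>A",
         "chr5:500:A>T", "chr6:600:G>C"}

precision, sensitivity = precision_sensitivity(called, truth)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

With these toy sets, three of the four calls are true positives and three of the five reference variants are recovered.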

## Abstract

Researchers developed a highly precise and accurate approach for identifying de novo genetic variants in probands from trio genome sequencing, making it suitable for automated large-scale analyses.

Accurate identification of de novo variants (DNVs) remains challenging despite advances in sequencing technologies, often requiring ad hoc filters and manual inspection. Here, we explored a purely informatic, consensus-based approach for identifying DNVs in proband–parent trios using short-read genome sequencing data. We evaluated variant calls generated by three sequence analysis pipelines—GATK HaplotypeCaller, DeepTrio, and Velsera GRAF—and examined the assumption that a requirement of consensus can serve as an effective filter for high-quality DNVs. Comparison with a highly accurate DNV set, validated previously by manual inspection and Sanger sequencing, demonstrated that consensus filtering, followed by a force-calling procedure, effectively removed false-positive calls, achieving 98.0–99.4% precision. At the same time, sensitivity of the workflow based on the previously established DNVs reached 99.4%. Validation in the HG002-3-4 Genome-in-a-Bottle trio confirmed its robustness, with precision reaching 99.2% and sensitivity up to 96.6%. We believe that this consensus approach can be widely implemented as an automated bioinformatics workflow suitable for large-scale analyses without the need for manual intervention, especially when very high precision is valued over sensitivity.
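
The core idea of the consensus requirement can be sketched as a set intersection: a candidate DNV is kept only if every pipeline reports it. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Variant` type, the `consensus_dnvs` function, and the toy calls are hypothetical; the sketch assumes variants have already been normalized to a common representation, and it does not model the force-calling step described in the abstract.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical normalized variant key."""
    chrom: str
    pos: int
    ref: str
    alt: str


def consensus_dnvs(callsets: dict[str, set[Variant]]) -> set[Variant]:
    """Return the candidate DNVs reported by every caller in `callsets`."""
    if not callsets:
        return set()
    return set.intersection(*callsets.values())


# Toy candidate DNV calls per pipeline (made-up data for illustration).
a = Variant("chr1", 100, "A", "G")
b = Variant("chr2", 200, "C", "T")
c = Variant("chr3", 300, "G", "A")

callsets = {
    "GATK HaplotypeCaller": {a, b, c},
    "DeepTrio": {a, b},
    "Velsera GRAF": {a, c},
}

consensus = consensus_dnvs(callsets)
print(sorted(consensus))
```

Only the variant reported by all three pipelines survives the filter. Intersection trades sensitivity for precision, which matches the abstract's emphasis on settings where very high precision is valued over sensitivity.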

## Full-text entities

- **Diseases:** Autism (MESH:D001321), Cancer (MESH:D009369), CHD (MESH:D006330), Mendelian genetic diseases (MESH:D030342)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11953573/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11953573/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC11953573/full.md

---
Source: https://tomesphere.com/paper/PMC11953573