Metagenome-assembled genomes enhance bacterial read decontamination and variant calling in oral samples
Zunu An, Jun Hyung Cha, Kyu Ha Lee, Insuk Lee

TL;DR
Using metagenome-assembled genomes improves DNA sequencing accuracy in saliva samples by better removing bacterial contamination.
Contribution
A new decontamination pipeline using metagenome-assembled genomes improves variant calling in oral DNA sequencing.
Findings
MAG-augmented decontamination outperforms conventional methods in variant calling accuracy.
Decontamination improves detection of variants in GC-rich genomic regions.
Some bacterial regions mimic human DNA, potentially causing genotyping errors.
Abstract
Whole genome sequencing (WGS), typically performed on blood-derived DNA, offers advantages over DNA chip-based genotyping. However, saliva and buccal samples, which are popular in direct-to-consumer tests, suffer reduced accuracy because of oral bacterial contamination. Decontamination strategies using decoy bacterial genomes have yielded limited improvements, likely because they cover only the subset of oral bacteria with available isolate genomes. To overcome this, we developed a decontamination pipeline leveraging metagenome-assembled genomes (MAGs). Concordance analysis of variant calling between blood and matched oral samples confirmed the superiority of MAG-augmented decontamination over conventional methods relying mainly on isolate genomes. Although the underlying mechanism remains unclear, the pipeline particularly improves variant calls in GC-rich regions, recovering many likely pathogenic variants. Additionally,…
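The concordance analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`normalize_genotype`, `genotype_concordance`), the site key layout, and the toy genotypes are all assumptions made for illustration. It treats the blood call set as the reference and reports the fraction of sites called in both samples whose genotypes agree.

```python
from typing import Dict, Tuple

# Hypothetical site key: (chromosome, position, ref allele, alt allele).
Site = Tuple[str, int, str, str]


def normalize_genotype(gt: str) -> Tuple[str, ...]:
    """Ignore phasing and allele order, so '1|0' and '0/1' compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def genotype_concordance(blood: Dict[Site, str], oral: Dict[Site, str]) -> float:
    """Fraction of sites called in both samples with matching genotypes."""
    shared = blood.keys() & oral.keys()
    if not shared:
        raise ValueError("no sites are called in both samples")
    matches = sum(
        1
        for site in shared
        if normalize_genotype(blood[site]) == normalize_genotype(oral[site])
    )
    return matches / len(shared)


# Toy data (invented for illustration, not taken from the paper).
blood_calls = {
    ("chr1", 100, "A", "G"): "0/1",
    ("chr1", 200, "C", "T"): "1/1",
    ("chr2", 50, "G", "A"): "0/1",
}
oral_calls = {
    ("chr1", 100, "A", "G"): "1|0",  # same genotype, phased notation
    ("chr1", 200, "C", "T"): "0/1",  # discordant call
    ("chr3", 5, "T", "C"): "0/1",    # called only in the oral sample
}

concordance = genotype_concordance(blood_calls, oral_calls)
print(f"concordance over shared sites: {concordance:.2f}")
```

Under this sketch, comparing the concordance of an oral call set before and after decontamination against the matched blood call set would indicate whether decontamination helped. Sites called in only one sample are ignored here; a fuller comparison would count them separately as missed or extra calls.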
