Synthetic data-driven AI approach for fetal chromosomal aneuploidies detection
Changhoe Hwang, Krishna Prasad Adhikari, Gyeongin Oh, Sunshin Kim

TL;DR
This paper introduces a synthetic data-driven AI method to detect fetal chromosomal abnormalities, overcoming the challenge of limited real positive data.
Contribution
A novel synthetic data generation approach for fetal aneuploidy detection with high similarity to real data and improved model performance.
Findings
Synthetic data generation achieved >99.9% similarity to real data for fetal chromosomal aneuploidy detection.
Logistic regression models maintained 100% sensitivity and positive predictive value (PPV) for autosomal aneuploidies and ≥99.6% for sex chromosome aneuploidies.
The method demonstrated 100% accuracy on real positive fetal aneuploidy cases.
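The sensitivity and PPV figures in these findings follow the usual confusion-matrix definitions. Below is a minimal sketch, assuming those standard definitions. The function names and the counts in the usage lines are hypothetical illustrations and are not taken from the paper.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly affected cases that the classifier flags."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_positives / total


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged cases that are truly affected."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no positive calls to evaluate")
    return true_positives / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_sensitivity = sensitivity(true_positives=249, false_negatives=1)
example_ppv = positive_predictive_value(true_positives=249, false_positives=1)
print(f"sensitivity={example_sensitivity:.3f}, PPV={example_ppv:.3f}")
```

With no false negatives and no false positives, both functions return exactly 1.0, which corresponds to the 100% values reported for autosomal aneuploidies.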
Abstract
A major limitation in developing fetal chromosomal aneuploidy detection technologies is the scarcity of real positive data. To address this, we propose a novel methodology to generate virtually unlimited synthetic negative and positive datasets with >99.9% similarity to real data, enabling accurate detection of both autosomal chromosome aneuploidies (ACA) and sex chromosome aneuploidies (SCA). Blood samples from 15,999 pregnant women were analyzed, including 186 clinically confirmed positive cases. Using 701 high-confidence negatives as a reference, we designed algorithms for synthetic data generation. For negatives, multiple real FASTQ files were randomly merged, and fetal fraction (FF) was recalculated to reflect biological variability. For positives, chromosome-specific read counts were adjusted using numerical equations: ACAs were simulated by…
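The authors' equations are cut off in this excerpt, so the following is not their formula. It is only a sketch of the standard first-order relation in cell-free DNA screening: if a fraction FF of the plasma DNA is fetal and the fetus carries c copies of a chromosome instead of 2, the expected read count for that chromosome scales by (1 - FF) + FF * c / 2. The function name and parameters are hypothetical.

```python
def scale_reads_for_aneuploidy(read_count: float,
                               fetal_fraction: float,
                               fetal_copies: int = 3) -> float:
    """Scale a euploid read count to the count expected when the fetus
    carries `fetal_copies` copies of the chromosome.

    Assumption (not from the paper): the maternal contribution is
    unchanged, and the fetal contribution scales linearly with copy number.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must lie in [0, 1]")
    if fetal_copies < 0:
        raise ValueError("fetal_copies must be non-negative")
    maternal_part = 1.0 - fetal_fraction
    fetal_part = fetal_fraction * (fetal_copies / 2.0)
    return read_count * (maternal_part + fetal_part)


# Trisomy at 10% fetal fraction: reads rise by roughly FF / 2 = 5%.
trisomy_reads = scale_reads_for_aneuploidy(100_000, fetal_fraction=0.10)

# Monosomy at the same fetal fraction: reads fall by roughly 5%.
monosomy_reads = scale_reads_for_aneuploidy(100_000, 0.10, fetal_copies=1)
print(round(trisomy_reads), round(monosomy_reads))
```

The relation shows why low fetal fraction makes detection hard: at FF = 4%, a trisomy shifts the affected chromosome's read count by only about 2%.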
Taxonomy
Topics: Prenatal Screening and Diagnostics
