From enrichment to interpretation: PS4-driven reclassification in Taiwanese inherited retinal degeneration

Yu-Shu Huang; Chien-Yu Lin; Yu-An Chen; Chieh-Yu Lee; Chang-Hao Yang; Jacob Shujui Hsu; Ta-Ching Chen; Pei-Lung Chen

PMC · DOI:10.1186/s40246-026-00923-0·February 15, 2026

From enrichment to interpretation: PS4-driven reclassification in Taiwanese inherited retinal degeneration

Yu-Shu Huang, Chien-Yu Lin, Yu-An Chen, Chieh-Yu Lee, Chang-Hao Yang, Jacob Shujui Hsu, Ta-Ching Chen, Pei-Lung Chen

PDF

Open Access

TL;DR

This study uses population data to improve the classification of genetic variants in inherited retinal diseases, enhancing diagnostic accuracy in a Taiwanese cohort.

Contribution

A case–control framework using ancestry-matched data enables PS4-driven reclassification of IRD variants.

Findings

01

PS4 reclassified two variants from Likely Pathogenic to Pathogenic and six from VUS to Likely Pathogenic.

02

Exemplar variants showed strong genotype–phenotype concordance, validating the workflow.

03

The framework is reproducible and applicable to other monogenic disorders.

Abstract

Inherited retinal degeneration (IRD) comprises a diverse group of monogenic disorders characterized by marked genetic and phenotypic heterogeneity. Although next-generation sequencing (NGS) enables the identification of candidate variants, many remain classified as variants of uncertain significance (VUS). Ancestry-matched population data can strengthen comparative evidence, and the emergence of national biobanks provides new opportunities to operationalize ACMG/AMP criterion PS4 through case–control analyses. We integrated an IRD cohort of 802 probands with whole-genome allele frequency data from 1,492 individuals in the Taiwan Biobank. An allele-based case–control framework was applied, assigning PS4 when the Haldane–Anscombe–corrected odds ratio was ≥ 5 and the 95% confidence interval excluded 1. Post-PS4 triage required variants to: (i) reside in IRD-associated genes, (ii) be rare…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Genes3

CNGB1 CFAP410 TAS2R18P

Proteins2

Diseases2

monogenic disorders IRD

Mutations2

c.340_351dupc.2921T > G

Figures2

Click any figure to enlarge with its caption.

Effect of PS4 evidence on ACMG/AMP classifications. Comparison of variant classifications before (teal) and after (salmon) applying PS4 evidence derived from case–control allele enrichment against Taiwan Biobank whole-genome reference data. Incorporation of PS4 reduced the number of variants of uncertain significance (VUS) and increased Likely pathogenic and Pathogenic designations following manual curation. Exact counts are listed in Supplementary Tables S3 and S4.

Pedigrees, retinal imaging, and molecular validation for homozygous IRD variants. A, *CNGB1*-affected cases with variant [NM_001297.5](NM_001297.5):c.2921T > G. Pedigrees of two unrelated families carrying the homozygous *CNGB1* variant are shown, with probands indicated by arrows. Representative fundus autofluorescence images from the probands display diffuse pigmentary changes with macular involvement, findings consistent with a rod–cone dystrophy pattern. Sanger sequencing of proband N996 confirms *CNGB1*([NM_001297.5](NM_001297.5)):c.2921T > G, with the altered nucleotide indicated; a cont

Funding2

—https://doi.org/10.13039/501100005762National Taiwan University Hospital
—https://doi.org/10.13039/100020595National Science and Technology Council

Keywords

Inherited retinal degenerationPS4 (case–control enrichment)ACMG/AMP classificationAncestry-matched controlsTaiwan biobankVariant interpretation

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsRetinal Development and Disorders · Genomics and Rare Diseases · Genetic Neurodegenerative Diseases

Full text

Introduction

Inherited retinal degeneration (IRD) comprises a group of monogenic disorders characterized by progressive loss of photoreceptors and retinal pigment epithelium, ultimately leading to visual impairment and blindness [1]. Pathogenic variants have been identified in more than 300 genes [2], giving rise to a wide clinical spectrum that includes rod–cone dystrophy, cone–rod dystrophy, macular dystrophy, Leber congenital amaurosis, and syndromic forms of retinal disease [3]. This diversity underscores the marked genotypic and phenotypic heterogeneity of IRD. In recent years, increasingly large IRD cohorts have been reported across diverse populations [4–6]. In parallel, many countries have established population biobanks that provide ancestry-resolved allele frequencies at a population scale [7–9].

The convergence of well-phenotyped disease cohorts and national biobanks offers new opportunities for ancestry-matched case–control analyses, in which patients serve as cases and biobank participants as population-matched controls. Within the ACMG/AMP variant interpretation framework [10], such analyses enable the application of the PS4 criterion, which evaluates variant enrichment in affected individuals relative to controls. In principle, ancestry-matched enrichment analyses can provide critical evidence to support reclassification of variants of uncertain significance (VUS), complementing segregation and functional data. However, despite its conceptual importance, PS4 remains underutilized in IRD variant interpretation, particularly in non-European populations. The lack of ancestry-matched control data and the absence of reproducible analytical workflows have limited the systematic implementation of PS4 in routine clinical genomics, leaving many potentially pathogenic variants unresolved.

Building on prior work from the Taiwan Inherited Retinal Degeneration Project (TIP), which characterized the genetic landscape and epidemiology of a national IRD cohort [11], the TIP dataset expanded to include 802 probands by June 2024. For ancestry-matched population frequencies, we leveraged high-coverage whole-genome sequencing data from the Taiwan Biobank, comprising 1,492 Taiwanese individuals [12]. In this study, we integrate a national IRD cohort with a population-matched biobank to systematically apply PS4 in a Taiwanese context. Our objectives are to quantify the incremental contribution of PS4 to ACMG/AMP variant classification and to establish a reproducible, ancestry-aware workflow that can be adapted to other monogenic disease programs with comparably sized cohorts.

Materials and methods

Patients

Between July 2015 and June 2024, we recruited 1,182 individuals from 937 families with suspected IRD. After comprehensive ophthalmic evaluation, including best-corrected visual acuity, electroretinography, color fundus photography, optical coherence tomography, and fundus autofluorescence imaging, 138 cases from 135 families were excluded. Exclusion criteria included non-IRD clinical diagnoses, atypical or unilateral phenotypes inconsistent with IRD, and incomplete clinical, genetic, or consent data that precluded reliable interpretation. Peripheral blood samples were collected from all participants after obtaining informed consent. Genomic DNA was extracted from peripheral blood mononuclear cells using the Gentra Puregene Blood Kit (QIAGEN, Hilden, Germany).

TIP panel sequencing and data processing

All probands underwent targeted sequencing using an IRD gene panel developed by the Taiwan Inherited Retinal Degeneration Project (TIP), as previously described [11]. Four panel versions were used. Versions 1 to 3 targeted 233 monogenic IRD genes, while version 4 targeted 182 genes (Supplementary Table S1). Version 4 was streamlined to improve cost-effectiveness by removing genes that had not contributed to any confirmed diagnoses in earlier iterations. A detailed breakdown of panel versions, including gene content, capture completeness, and the number of samples sequenced with each version, is provided in Supplementary Table S2. Captured libraries were sequenced in paired-end mode on Illumina MiSeq or NextSeq 550 instruments (Illumina, San Diego, CA, USA). Raw FASTQ files were processed according to the Genome Analysis Toolkit (GATK) best practices using the Illumina DRAGEN Bio-IT Platform (v4.2). The workflow included alignment to the GRCh38 reference genome, duplicate marking, base-quality recalibration, and germline single nucleotide variant and indel calling. Per-sample genomic Variant Call Formats (gVCFs) were generated and jointly genotyped across the cohort using DRAGEN’s joint-calling workflow to produce a multi-sample Variant Call Format (VCF). Variant filtering was performed using DRAGEN’s built-in algorithms, and sequencing depth as well as coverage metrics were summarized. All variants were subsequently annotated with ANNOVAR [13].

PS4 computation (case–control enrichment)

We assessed the ACMG/AMP criterion PS4 by evaluating whether each variant was enriched in cases compared with controls using an allele-based 2 × 2 contingency framework. For each variant, case-side allele counts (AC), allele numbers (AN), and allele frequencies (AF) were parsed from a VCF-derived key–value field. The reference-allele count in cases was defined as the total number of alleles observed in cases after subtracting the number of alternate alleles. Control allele counts and allele numbers were obtained directly from the Taiwan Biobank whole-genome sequencing dataset (TWB1492) [12] using variant-level, empirically observed AC and AN values. These control AC and AN values therefore reflect site-specific call rates and missingness rather than uniform or inferred denominators. Control reference-allele counts were calculated by subtracting the alternate-allele count from the total control allele number. For variants with an observed control alternate-allele count of zero, site-level callability was examined to confirm that the absence of alternate alleles reflected true absence in the control cohort rather than insufficient sequencing coverage. Odds ratios were computed using Haldane–Anscombe–corrected counts, and two-sided 95% confidence intervals as well as p-values were obtained using Fisher’s exact test. To account for multiple testing across all variants subjected to PS4 case–control analysis, q-values were calculated using the Benjamini–Hochberg false discovery rate (FDR) procedure. FDR-adjusted q-values were used as a sensitivity analysis to assess the robustness of PS4-enriched variants, but were not applied as mandatory thresholds for PS4 assignment, consistent with ACMG/AMP 2015 guidelines. For variants located on chromosome X, we additionally performed sex-stratified PS4 analysis. Male (XY) and female (XX) samples were analyzed separately, with allele counts, allele numbers, and odds ratios calculated independently for each sex to account for sex-specific ploidy and potential imbalance in cohort composition.

Variant prioritization after PS4 enrichment

After identifying PS4-enriched variants, defined as those with an odds ratio ≥ 5 and a 95% confidence interval that excluded 1 [10], we applied the following prioritization steps. We first retained variants captured by the TIP IRD gene panel and removed those not mapped to genes included in the panel. Variants present at an allele frequency greater than 0.01 in East Asian populations in either the gnomAD genome or exome datasets (v4.1) were excluded. Based on RefSeq annotations, we retained variants classified as exonic, untranslated regions (UTR), or splicing (± 20 bp), and removed variants assigned to other categories. For all variants that passed these filters, pathogenicity assessment followed ACMG/AMP guidelines. We used the GeneBe platform (https://genebe.net/) to generate evidence tags and subsequently performed manual review for each variant to adjudicate the evidence, adding PS4 where applicable. Final ACMG classifications were determined after manual review and summarized in Supplementary Tables S3 and S4.

Results

PS4-driven enrichment and impact on classification

Using the predefined PS4 framework, we identified variants in the TIP cohort that were significantly enriched in cases compared with population-matched controls, followed by post-PS4 filtering as described in the Methods. This two-step triage markedly reduced the number of candidate variants while retaining those with plausible molecular and clinical relevance. The impact of PS4 on categorical variant interpretation is summarized in Fig. 1 and detailed in Supplementary Tables S3 and S4. Baseline ACMG/AMP evidence tags and provisional classifications were first generated using GeneBe without PS4, after which variants were manually reviewed to assess whether case–control enrichment, considered together with phenotype and inheritance consistency, supported reclassification.

Fig. 1. Effect of PS4 evidence on ACMG/AMP classifications. Comparison of variant classifications before (teal) and after (salmon) applying PS4 evidence derived from case–control allele enrichment against Taiwan Biobank whole-genome reference data. Incorporation of PS4 reduced the number of variants of uncertain significance (VUS) and increased Likely pathogenic and Pathogenic designations following manual curation. Exact counts are listed in Supplementary Tables S3 and S4.

Incorporation of PS4 led to clinically meaningful upgrades for a subset of IRD-associated variants. Two variants originally classified as Likely pathogenic were upgraded to Pathogenic (CYP4V2:c.802_810delinsGC and CYP4V2:c.1507G > C). In addition, six variants that were classified as Variant of Uncertain Significance (VUS) in the baseline GeneBe-based workflow were upgraded to Likely pathogenic after incorporation of PS4 evidence, including CFAP410:c.340_351dup, EYS:c.5644 + 5G > A, USH2A:c.13339 A > G, CNGB1:c.2921T > G, REEP6:c.223G > A, and USH2A:c.2802T > G (Table 1; Supplementary Table S3). By contrast, the proportions of Benign and Likely benign classifications remained largely unchanged. Collectively, these findings demonstrate that incorporation of PS4 provides important additional evidence for a subset of variants and improves classification resolution in this panel-sequenced IRD cohort. For variants located on chromosome X, additional sex-stratified PS4 analysis was performed to account for sex-specific ploidy and cohort composition, yielding more biologically appropriate enrichment signals for X-linked variants, including RS1(NM_000330):c.531T > G (Table 1; Supplementary Table S4). As a sensitivity analysis, false discovery rate (FDR) adjusted q-values were calculated across all variants subjected to PS4 testing. A substantial subset of PS4-enriched variants remained significant after FDR correction (q < 0.05; Supplementary Table S5), supporting the robustness of these findings under multiple testing control.

Table 1. Clinically classified variant enrichment: TIP cohort vs. Taiwan biobankGeneVariantCase AC/AN(TIP cohort)Control AC/AN(Taiwan Biobank)OR (95% CI)ACMG classification^a^ClinVar^b^ EYS NM_001142800.2:c.7492G > C13/16040/298450.6 (3.01-852.33)PP/LP PROM1 NM_006017.3:c.139del10/16040/298439.3 (2.30-671.24)PP/LP ABCA4 NM_000350.3:c.5761G > A6/16040/298424.3 (1.37-431.15)PP/LP RS1 NM_000330.4:c.531T > G3/262 (XY)0/532 (XX)0/744 (XY)1/1496 (XX)20.1 (1.03-390.13)0.9 (0.04–23.02)PP/LP ABCA4 NM_000350.3:c.1804 C > T15/16041/298419.4 (3.63-103.77)PP/LP RDH12 NM_152443.3:c.505 C > G12/16041/298415.6 (2.87–85.01)PP/LP EYS NM_001142800.2:c.6416G > A62/16048/298414.2 (6.91–29.13)PP/LP EYS NM_001142800.2:c.8012T > A9/16041/298411.8 (2.11–66.34)PP/LP ABCA4 NM_000350.3:c.2894 A > G12/16042/29849.4 (2.41–36.46)PP USH2A NM_206933.4: c.4576G > A10/16042/29847.9 (1.98–31.25)PP EYS NM_001142800.2: c.7228 + 1G > A18/16044/29847.7 (2.75–21.68)PP/LP RCBTB1 NM_018191.4:c.170del5/14821/29847.4 (1.22–45.08)PP/LP USH2A NM_206933.4:c.9570 + 1G > A9/16042/29847.1 (1.76–28.65)PP ABCA4 NM_000350.3: c.101_106del5/16041/29846.8 (1.12–41.64)PP RLBP1 NM_000326.5:c.282del18/16046/29845.3 (2.18–13.09)PP/LP CYP4V2 NM_207352.4:c.802-8_810delinsGC34/160412/29845.2 (2.73–10.01)LP > PP/LP BBS10 NM_024685.4:c.445dup6/16040/298424.3 (1.37-431.15)PConflicting POC1B NM_172240.3:c.419dup6/16040/298424.3 (1.37-431.15)PNo reported CYP4V2 NM_207352.4:c.1507G > C6/16041/29848.1 (1.37–47.79)LP > PVUS CFAP410 NM_004928.3:c.340_351dup7/10900/298441.3 (2.36-724.07)VUS > LPVUS EYS NM_001142800.2:c.5644 + 5G > A16/16041/298420.7 (3.88-110.04)VUS > LPP/LP USH2A NM_206933.4: c.13339 A > G9/16042/29847.1 (1.76–28.65)VUS > LPConflicting CNGB1 NM_001297.5:c.2921T > G8/16042/29846.4 (1.55–26.05)VUS > LPConflicting REEP6 NM_138393.4:c.223G > A5/10402/29846.3 (1.42–28.30)VUS > LPConflicting USH2A NM_206933.4:c.2802T > G22/16047/29845.6 (2.46–12.93)VUS > LPConflictingAC: allele count, AN: allele number, OR: odds ratio, P: pathogenic, LP: likely pathogenic, VUS: variant of uncertain significance^a^ACMG classification was assigned using GeneBe following the ACMG/AMP criteria. ^b^ClinVar status is based on the ClinVar database update on 2025-07-21

ClinVar concordance and cohort-specific refinements

To contextualize Table 1, and recognizing that ClinVar evidence is incorporated into the baseline GeneBe classification through PP5, we additionally compared PS4-enriched variants with current ClinVar entries (accessed July 21, 2025) to assess concordance with publicly available interpretations and to identify variants for which cohort-specific PS4 evidence provides refinement beyond existing submissions. Among these variants, 17 already had ClinVar submissions designating them as Pathogenic or Likely pathogenic, which was consistent with our classifications. Notably, 10 of these 17 variants were located in three genes that are most frequently implicated in Taiwanese IRD cohorts: EYS, USH2A, and ABCA4 [11]. In contrast, eight variants were listed in ClinVar as VUS, conflicting, or not reported. Three of these, BBS10:c.445dup, POC1B:c.419dup, and CYP4V2:c.1507G > C, were classified by our analysis pipeline as Likely pathogenic or higher even before PS4 evidence was incorporated, whereas ClinVar assertions were uncertain or absent. Across the TIP cohort, eight patients carried one of these three variants. Two individuals were homozygous for BBS10:c.445dup, three carried CYP4V2:c.1507G > C (a variant modeled to disrupt an α-helical segment and impair protein function) [14], and three were diagnosed with POC1B-related disease. These were the only POC1B diagnoses in the cohort, and all three affected individuals harbored the c.419dup variant as one of the disease-associated alleles, indicating that this variant is a recurrent pathogenic allele in this population. Overall, these comparisons indicate that most high-confidence variants align with existing ClinVar designations, whereas a substantial subset benefits from PS4-supported evidence and cohort-specific observations. This additional evidence enables reclassification of previously uncertain or conflicting variants toward clinically actionable interpretation.

Homozygous cases supporting PS4-driven reclassification

To further clarify the contribution of monoallelic and biallelic carriers to PS4-enriched variants, we summarized the diagnostic status of probands carrying variants listed in Table 1, including monoallelic and biallelic configurations, as shown in Supplementary Fig. S1. Here, we focus on homozygous exemplars that provided strong genotype–phenotype corroboration beyond statistical enrichment. Among the six variants that advanced from VUS to Likely pathogenic after PS4 integration, our cohort included homozygous carriers of two variants, CNGB1:c.2921T > G and CFAP410:c.340_351dup, which offered direct clinical and molecular support for reclassification. For two additional variants (USH2A:c.13339 A > G and REEP6:c.223G > A), orthogonal validation and phenotype concordance were confirmed in representative cases and are summarized in Supplementary Fig. S2. For the remaining two variants (USH2A:c.2802T > G and EYS:c.5644 + 5G > A), molecular and clinical evidence has been established in our previous studies and was therefore not duplicated here [15, 16], but incorporated into manual ACMG/AMP curation.

Two unrelated families in our cohort carried the same homozygous CNGB1:c.2921T > G variant, with pedigrees and probands shown in Fig. 2A (top). In both probands, fundus autofluorescence imaging demonstrated diffuse pigmentary change with macular involvement (Fig. 2A, middle), and each reported night blindness, which is characteristic of CNGB1-associated retinopathy. In proband N996, Sanger sequencing of CNGB1 exon 28 confirmed the c.2921T > G substitution (Fig. 2A, right), providing independent validation of the next-generation sequencing finding. Additional validation using the Integrative Genomics Viewer (IGV) is provided for the two independent CNGB1-diagnosed cases in Supplementary Fig. S3 to further support reproducibility [17]. The homozygous state was consistent with the known recessive inheritance pattern of CNGB1 retinopathy, and the clinical presentation, characterized by night blindness with compatible imaging, supported phenotype specificity (PP4). Together with case–control enrichment (PS4), these findings justified reclassification from VUS to Likely pathogenic in our analysis.

Fig. 2. Pedigrees, retinal imaging, and molecular validation for homozygous IRD variants. A, CNGB1-affected cases with variant NM_001297.5:c.2921T > G. Pedigrees of two unrelated families carrying the homozygous CNGB1 variant are shown, with probands indicated by arrows. Representative fundus autofluorescence images from the probands display diffuse pigmentary changes with macular involvement, findings consistent with a rod–cone dystrophy pattern. Sanger sequencing of proband N996 confirms CNGB1(NM_001297.5):c.2921T > G, with the altered nucleotide indicated; a control chromatogram is provided for comparison. B, CFAP410-affected case with variant NM_004928.3:c.340_351dup. The pedigree of a family harboring the homozygous CFAP410 small insertion–duplication is shown. Color fundus photographs from the proband demonstrate bilateral retinal abnormalities. Short-read alignment visualized in the Integrative Genomics Viewer (IGV) validates CFAP410(NM_004928.3):c.340_351dup, with supporting reads spanning the duplicated segment

In a separate family, the proband carried a homozygous CFAP410:c.340_351dup variant (Fig. 2B, left) and exhibited early-onset disease with nystagmus. Bilateral color fundus photographs showed retinitis pigmentosa (RP)-like pigmentary change with relative macular preservation (Fig. 2B, middle), and full-field electroretinography demonstrated flat cone responses with generalized photoreceptor dysfunction. Taken together, these findings are consistent with a rod–cone dystrophy (RCD) pattern. Short-read alignment in the IGV revealed a read-supported 12-bp duplication at the annotated locus (Fig. 2B, right), confirming the variant at the sequence level. These clinical and molecular features closely mirror those reported for CFAP410-associated retinopathy [18], supporting genotype–phenotype concordance and, together with PS4 enrichment, enabling reclassification from VUS to Likely pathogenic.

These two homozygous exemplars, each consistent with the expected recessive inheritance model and supported by retinal imaging and independent sequencing, demonstrate how PS4-driven enrichment, when integrated with orthogonal molecular validation and clinical context, can shift variants across interpretive thresholds and lead to clinically meaningful classifications in IRD panel sequencing. Importantly, such reclassification from VUS to Likely Pathogenic provides a definitive molecular diagnosis for affected individuals, which is a prerequisite for genetic counseling and for consideration of gene-specific management or clinical trial eligibility in inherited retinal disorders.

Discussion

This work provides one of the earliest systematic evaluations of the operational application of PS4 for IRD variant interpretation in a population-matched, panel-sequenced cohort of modest size. Although our case number is not large by international standards, an 802-case cohort is substantial and clinically representative within Taiwan. Pairing it with 1,492 ancestry-matched controls enabled robust case–control enrichment. By prespecifying thresholds (odds ratio ≥ 5 with the 95% confidence interval excluding 1), implementing joint genotyping, and applying a light and transparent post-PS4 triage, we demonstrate that a clearly defined, ancestry-matched PS4 workflow can contribute to classification refinement even when sample sizes are constrained. This process facilitates systematic evaluation of uncertain variants toward reportable findings while maintaining specificity. Our experience offers a practical template for centers with limited caseloads: use high-quality local controls, standardize PS4 computation and evidence mapping, and integrate enrichment with phenotype and inheritance checks.

Importantly, these observations underscore that PS4 does not operate in isolation and must be interpreted alongside population frequency criteria and other ACMG/AMP evidence. Among the variants that met PS4 criteria, the largest proportion was ultimately classified as Benign or Likely benign (Fig. 1), including a subset with apparently elevated odds ratios. Upon re-review, most of these high-odds-ratio sites also satisfied BA1 or BS1 population criteria, which appropriately outweighed PS4. Notably, BA1 and BS1 are typically evaluated against the highest reliable population frequency, such as gnomAD popmax, and against a disease-specific maximum credible allele frequency, which is often well below 1% for Mendelian disorders [19]. As a result, a variant may appear rare in East Asians and pass our post-PS4 East Asian frequency filter but still be common in another ancestry or exceed the disorder-specific ceiling. This pattern can yield a Benign or Likely benign classification despite an inflated odds ratio in our cohort. Cohort scale can also influence this effect. With 802 cases and 1,492 controls, stochastic variation and subtle population stratification can exaggerate odds ratios for otherwise tolerated alleles. Supporting this interpretation, a recent Japanese IRD guideline recommends that ancestry-matched cohorts include more than 20,000 individuals to achieve robust PS4 assessment. Larger sample sizes help stabilize effect estimates and reduce spurious enrichment caused by small numbers and substructure [20]. Consequently, while our design is useful for hypothesis generation and triage, larger ancestry-matched resources combined with popmax-aware and condition-specific frequency thresholds will be essential to reduce the proportion of high-odds-ratio variants ultimately classified as Benign or Likely benign. Such expansion will also support the use of graded PS4 strengths with greater confidence.

Evolving guidance in the field underscores that there is no single universal strategy for operationalizing PS4 across diseases. When well-powered case–control data are unavailable, several ClinGen Variant Curation Expert Panels (VCEPs) assign PS4 using affected-proband counts with gene- or phenotype-specific thresholds, as exemplified by approaches adopted in RASopathy [21]. In IRD, disease- and gene-specific adaptations of PS4 have likewise been proposed, particularly for X-linked conditions such as RPGR-associated disease, where conventional case–control designs may be impractical [22]. In addition to proband-count-based approaches, recent large-scale studies have demonstrated that PS4 can be calibrated directly from quantitative enrichment data derived from ancestry-matched case-control cohorts. In the context of hereditary hearing loss, odds-ratio- and likelihood-ratio-based thresholds have been derived from large numbers of cases and controls, enabling the assignment of graded PS4 strengths (supporting, moderate, and strong) according to observed variant frequencies and suggesting that such quantitative frameworks may be applicable to other monogenic disorders [23]. Importantly, comparative analyses across multiple disease domains have further shown that the odds-ratio thresholds required to reach a given PS4 tier differ by phenotype, indicating that PS4 calibration should be tailored to specific disease contexts rather than applied uniformly across conditions [24]. Collectively, these studies highlight that PS4 is inherently context-dependent, influenced by gene, phenotype, genetic architecture, and the availability of appropriately matched control data. This perspective provides a strong rationale for our ancestry-matched, explicitly defined PS4 framework in IRD, which leverages population-specific controls to generate interpretable and reproducible enrichment evidence within a clinically relevant setting.

This study has some limitations. First, although we used a standardized DRAGEN/GRCh38 pipeline with joint genotyping, differences in panel versions and sequencing batches may affect variant callability and result in non-uniform allele numbers (AN). Such inconsistencies can introduce minor biases in case-control counts and odds-ratio estimates. This impact is likely greater for intronic or intergenic variants due to more variable capture efficiency and coverage; therefore, detailed analyses of intronic or intergenic variants were not included in this study. Future work using whole-genome sequencing may enable more robust assessment of these regions. Second, statistical power remains a challenge in rare variant analysis. As shown in our sensitivity analyses, stringent multiple-testing correction (such as FDR) may reduce the apparent significance of ultra-rare variants, even when they show strong effect sizes. To address this, variants that did not pass FDR thresholds were interpreted only when supported by independent evidence, including segregation, or consistent clinical phenotypes. Larger ancestry-matched cohorts will be required in future studies to enable more definitive statistical evaluation of such rare variants without reliance on additional supporting data. Third, we applied a fixed PS4 criterion (odds ratio ≥ 5 with the 95% confidence interval excluding 1) following the 2015 ACMG/AMP guidelines. Emerging disease-specific frameworks recommend phenotype-dependent OR thresholds and graded PS4 strengths, meaning some clinically relevant variants might be under- or misclassified. Adopting these refined approaches in future analyses could improve classification performance.

In conclusion, this study demonstrates that a practical, ancestry-matched PS4 workflow can enhance variant interpretation in IRD, particularly in mid-sized cohorts. By integrating predefined enrichment criteria with phenotype and inheritance concordance, this workflow reduces the burden of variants of uncertain significance and improves the clinical utility of panel-based sequencing. The framework is readily adoptable by other IRD programs and extensible to additional monogenic disorders with well-characterized case cohorts and appropriate population reference data. Future efforts should focus on expanding ancestry-matched reference datasets, harmonizing technical metrics such as per-variant allele numbers, and implementing disease-aware, graded PS4 thresholds to further improve diagnostic yield and clinical impact.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1.

Supplementary Material 2.

Bibliography1

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Ret Net. Disease Table [Internet]. [cited 2023 Mar 29];Available from: https://web.sph.uth.edu/Ret Net/disease.htm