The distribution and spectrum of thalassemia variants in GUIYANG region, southern China
Xuanyin Zhao, Zhiyu You, Yunyan Deng, Yi Zhou, Dongyang Deng, Jian Quan, Fang Chen, Zhimei Yan, Ya Qi, Leilei Chen, Fang Xiang, Weixian Zheng, Ruyi Zhang

TL;DR
This study analyzed thalassemia gene testing results in Guiyang, southern China, finding high prevalence among Han ethnic groups and identifying new thalassemia variants.
Contribution
The study identified previously undocumented thalassemia variants and detailed their regional and ethnic distribution in southern China.
Findings
1401 individuals tested positive for thalassemia out of 20,478 tested in the region.
New thalassemia variants were discovered that had not been previously documented in medical literature.
The Han ethnic group showed the highest prevalence of thalassemia in the study area.
Abstract
Thalassemia is one of southern China’s most common inherited disorders. This retrospective study analyzed the results of thalassemia gene testing conducted on 20,478 individuals from January 1, 2019, to April 31, 2024 in the First Affiliated Hospital of Guizhou University of Traditional Chinese Medicine. The cohort consisted of 19,733 females and 745 males, with 1401 individuals testing positive for thalassemia. Among the positive cases, 942 had α thalassemia, 431 had β thalassemia, and 25 had variants in both α and β thalassemia genes. Interestingly, a subgroup of individuals with thalassemia variants not previously documented in medical literature was identified. The study highlighted the prevalence of thalassemia among different ethnic groups, with individuals of Han ethnicity being the most affected. Geographical analysis revealed a concentration of cases in Guizhou Province,…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2- —http://dx.doi.org/10.13039/501100004001Guizhou Provincial Science and Technology Department
- —http://dx.doi.org/10.13039/501100009619Science and Technology Bureau, Guiyang Municipal Government
- —Education Department of Guizhou Provincial
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHemoglobinopathies and Related Disorders · Iron Metabolism and Disorders
Introduction
Thalassemia is a significant autosomal recessive blood disorder characterized by the disrupted synthesis of one or more globin proteins, which leads to alterations in both the structure and function of hemoglobin [1, 2]. This malformation compromises the oxygen-carrying capacity of erythrocytes, initiating varying degrees of hemolytic anemia [3]. The disorder predominantly manifests in warmer equatorial regions, with a notably high prevalence in the southern provinces of China, where specific genetic variations are more common [4]. Globally, it is estimated that approximately 400 million individuals are carriers of thalassemic variants, affecting populations in over 160 countries [5]. Thalassemia is categorized into three main forms: alpha (α-), beta (β-), and delta (δ-) thalassemia. Interestingly, a subgroup is distinguished by the specific globin gene variant involved [6].
Of these, α- and β-thalassemia are the most prevalent; α-thalassemia generally arises from gene deletions, while β-thalassemia is predominantly caused by point mutation [7].
The clinical manifestations of thalassemia can vary significantly, ranging from mild and asymptomatic cases to severe, life-threatening conditions, depending on the specific location and type of thalassemia gene variants. The synthesis of α chains is regulated by four genes (αα/αα), and the severity of clinical and analytical abnormalities depends on whether one, two, three, or all four α genes are absent. The deletion of a single gene is typically asymptomatic, whereas the deletion of all four genes can lead to hydrops fetalis, which is incompatible with life. In contrast, β chain synthesis is regulated by only two genes (β/β), resulting in a narrower range of clinical consequences: from asymptomatic carriers (β/0) to severe forms of hemolytic anemia, depending on the specific DNA variant affecting both β genes [8]. Individuals with thalassemia minor may show little to no signs of anemia, whereas severe forms of α-thalassemia can result in fetal or neonatal mortality, sometimes causing complications for the mother as well [9]. In contrast, children with β-thalassemia major develop pronounced anemia soon after birth, requiring lifelong management strategies, including regular blood transfusions and chelation therapy to manage iron overload [10].
Physically, significant cases of thalassemia are characterized by symptoms such as fatigue, weakness, shortness of breath, pallor, and splenomegaly among other complications [2, 4, 11]. The broader impact of thalassemia on families and communities highlights the necessity of robust public health strategies such as carrier screening and genetic counseling, aimed at reducing its incidence [12]. However, these public health measures are supported by significant advancements in molecular diagnostics, which have enhanced our understanding of the genetic underpinnings of thalassemia. Recent research developments have led to the identification of rare variants, furthering our knowledge and enabling targeted interventions [12].
Building on these insights, we have compiled and analyzed thalassemia detection data from Guiyang region, capital of Guizhou Province in South China. This analysis not only contributes to our understanding of regional genetic variability but also underscores the ongoing necessity for comprehensive public health initiatives and improved educational programs aimed at managing and reducing the burden of thalassemia globally.
Materials and methods
Population
This study was conducted at the First Affiliated Hospital of Guizhou University of Traditional Chinese Medicine where a total of 20,478 samples were collected for thalassemia gene screening from January 1, 2019, to March 31, 2024. Samples were obtained from routine prenatal screenings, physical examinations, and inpatient evaluations tailored for individuals suspected of having thalassemia. Additionally, samples displaying phenotypic traits suggestive of thalassemia, yet undiagnosed via the commercial diagnostic kit from Hybribio Company, were collected. Alternative diagnostic methods, such as electrophoresis and sequencing, were utilized to confirm the presence of thalassemia variants. This multifaceted diagnostic approach enabled a more comprehensive identification of thalassemic variants and improved the diagnostic accuracy through the integration of advanced molecular techniques, thereby enriching the understanding of thalassemia’s genetic epidemiology in the region.
Reagents and instruments
The reagents used included the Hybribio Blood Genomic DNA Extraction Kit and the Mediterranean Anemia Hybridization Reagent. The principal instrument employed was the HB2012A Nucleic Acid Molecular Hybridization Instrument from Guangdong Hybribio Biotech Company. For sample collection, 2 ml of venous blood was drawn from each participant into an EDTA anticoagulant tube and stored at 2–8 °C.
Flow-through hybridization and gene chip (FHGC)
The PCR mix was retrieved from the − 20 °C freezer 30 min beforehand and thawed at 4 °C before centrifugation. Similarly, the Taq enzyme, also stored at − 20 °C, was briefly centrifuged following retrieval. Both the PCR mix and the enzyme, after centrifugation, had 5 μl of DNA solution added, mixed, and centrifuged once more before being placed in the PCR machine for amplification. The PCR cycle parameters were set as follows: initial denaturation at 95 °C for 15 min, followed by 35 cycles of 97 °C for 50 s, 60 °C for 60 s, 72 °C for 120 s, with a final extension at 72 °C for 19 min. PCR products were stored at 4 °C after amplification.
For hybridization, PCR products were heated in the PCR machine at 95 °C for 10 min for denaturation and quickly cooled in an ice bath. The hybridization instrument was set up by placing the metal plate, separation membrane, and adding distilled water, followed by draining the water completely with the pump. Next, the hybridization membrane was arranged, topped with silicone rings and the separation chamber, and the cover snapped on. After draining the water completely using the pump, 0.8 ml of pre-warmed hybridization solution was added for pre-hybridization at 42 °C. This was followed by adding the denatured α-thalassemia and β-thalassemia PCR products for capillary hybridization with fresh hybridization solution for 30 min at 42 °C. Washing the hybridization membrane involved the use of warmed solution WB1, followed by a temperature reset to 25 °C for blocking and enzyme labeling, with subsequent washes and development steps until the final washing with distilled water and drying between absorbent paper sheets. Results were interpreted as per manual instructions, with ambiguous results sent to Guangdong Hybribio Biotech for sequencing verification.
Statistical analysis
Continuous variables were summarized using descriptive statistics, including means and ranges or standard deviations, while categorical variables were presented as counts and percentages. All statistical analyses were conducted using Microsoft office Excel and R software.
Results
Characteristics of the studied population
Between January 1, 2019, and March 31, 2024, our center conducted a thalassemia gene testing on a total of 20,478 individuals. As most samples come from pregnant women during prenatal screening, the study population consisted of 19,733 females and 745 males. The cohort included 19,928 individuals who were undergoing routine medical examinations and 550 hospitalized patients. Age distribution within the study sample revealed that 155 individuals were younger than 1 year, 686 were aged between 1 and 22 years, and 19,637 were older than 22 years.
In this retrospective analysis, it was found that 1401 individuals tested positive for the presence of the thalassemia gene, representing a positive rate of 6.84%. This group was predominantly female, with 1196 individuals testing positive, compared to 205 males. The positive cases were categorized as 1167 individuals who were part of the routine medical examination group and 234 who were hospitalized patients. The age distribution of positive cases indicated that 27 individuals were younger than 1 year, 113 were aged between 1 and 22 years, and 1261 were older than 22 years (Table 1 and Fig. 1).Table 1. Characteristics of the client populationAll of subjects (Number of people)Thalassemia positive (Number of people) (%)Female19,7331196 (6.06)Male745205 (27.52)Health examination19,9281167 (5.86)Hospitalization550234 (42.55)Age (years) < 115527 (17.42) > 1, ≤ 22686113 (16.47) ≥ 2219,6371261 (64.22)Fig. 1. Composition of study population included
Prevalence of thalassemia
Of the 1401 individuals identified as positive for thalassemia, a detailed analysis revealed that 942 individuals harbored α thalassemia, 431 individuals presented with β thalassemia, and 25 individuals manifested variants involving both α and β thalassemia genes. Among those with α thalassemia, the most prevalent variant was the 3.7 heterozygous variant, with subsequent occurrences of SEA deletion, α4.2 deletion, and α^CS^ variant. In the case of β thalassemia, the CD17 heterozygous variant emerged as the most common, followed by CD41-42, IVS-II-654, and βEM variant.
Of particular interest, a subgroup of seventeen individuals displayed thalassemia variants that had not been previously documented in medical literature. Within this subgroup, 5 individuals exhibited α thalassemia variant, 9 individuals displayed β thalassemia variant, and 3 individuals presented with variant involving both forms of thalassemia (Tables 2, 3, 4 and 5, supplement Tables 1 and 2).Table 2. Distribution of α-thalassemia genotypesGenotypeCounts (n)Percentages in α-thalassemiaHb (g/L)(mean ± SD)MCV (fL)(mean ± SD)MCH (pg/cell)(mean ± SD)-α^3.7^/deletion45948.99%123.06 ± 15.5985.28 ± 6.4728.14 ± 2.46_SEA/deletion22624.12%111.60 ± 19.4470.52 ± 6.3622.49 ± 2.36-α^4.2^/deletion899.50%125.04 ± 14.8986.27 ± 5.7828.39 ± 2.05-α^CS^/mutation828.75%110.95 ± 23.2483.12 ± 7.7626.84 ± 3.25-α^WS^/mutation232.45%121.32 ± 14.9183.60 ± 6.5227.60 ± 2.67-α^QS^/mutation131.39%110.85 ± 18.8577.27 ± 4.6224.51 ± 2.73αααanti4.260.64%100.00 ± 17.2873.30 ± 6.8623.10 ± 2.56HBA2:c.300 + 55T > G50.53%119.10 ± 14.4269.44 ± 9.3522.02 ± 4.41_SEA/-α^QS^40.43%72.00 ± NA68.00 ± NA17.90 ± NAHBA1:c.300 + 55G > T40.32%121.00 ± 12.4291.63 ± 1.6029.80 ± 1.56_SEA/-α^CS^30.32%73.33 ± 30.7367.20 ± 4.2219.77 ± 1.33THAI30.32%119.50 ± 10.6170.35 ± 1.7422.80 ± 1.76-α^3.7^/_SEA20.21%68.00 ± 3.0060.10 ± 7.4017.40 ± 1.40-α3.7/-α3.720.21%112.5 ± 0.7175.00 ± 0.8524.20 ± 0-α^3.7^/-α^CS^20.21%126.00 ± 6.0073.70 ± 3.9026.45 ± 4.35-α4.2/HBA1:c.223G > C10.11%143 ± NA78.9 ± NA26.3 ± NAHBA1:c.95 + 1G > A10.11%150.00 ± NA80.40 ± NA25.60 ± NA_SEA/HBA2:c.427T > C10.11%74.00 ± NA74.90 ± NA19.00 ± NAHBA1:c.104T > G10.11%123.00 ± NA86.70 ± NA29.60 ± NAHBA1:c.273G > C10.11%124.20 ± NA91.10 ± NA31.50 ± NAHBA1:c.84G > T10.11%127.00 ± NA79.80 ± NA25.30 ± NAHBA2:c.159T > G10.11%127.00 ± NA84.40 ± NA25.50 ± NAHBA2:c.218A > G10.11%135.00 ± NA76.50 ± NA24.40 ± NAHBA2:c.300 + 34G > A10.11%116.00 ± NA84.30 ± NA26.70 ± NAHBA2:c.377T > C10.11%^#^132.00 ± 1.80^#^75.20 ± 3.30^#^24.5 ± 0.50-α^QS^/αααanti3.710.11%95.00 ± NA79.40 ± NA22.40 ± NAHBA2:c.51G > T10.11%155.00 ± 28.2892.60 ± 1.1332.50 ± 0.85HBA2:c.80C > A10.11%124.00 ± NA85.70 ± NA30.00 ± NAαααanti3.710.11%99.00 ± NA84.20 ± NA26.90 ± NATotal937100NANANA^#^Data are derived from other literature or databasesTable 3Distribution of β-thalassemia genotypesGenotypeCounts (n)Percentages in β-thalassemia (%)Hb (g/L)(mean ± SD)MCV (fL)(mean ± SD)MCH (pg/cell)(mean ± SD)CD1715436.58104.48 ± 19.8668.83 ± 9.1621.95 ± 3.49CD41-4212529.69104.04 ± 17.6469.13 ± 9.5422.12 ± 3.56IVS-II-6546014.2596.81 ± 18.7069.39 ± 7.2721.88 ± 2.55βEM153.56118.37 ± 11.8183.60 ± 4.5127.75 ± 1.31CapM112.61129.00 ± 10.2692.12 ± 4.3730.99 ± 1.45CD4381.9093.27 ± 20.5671.41 ± 6.0822.18 ± 1.54HBB:c.341T > A81.90127.20 ± 17.6587.33 ± 3.3928.57 ± 1.21-28C/N61.43128.86 ± 7.5472.80 ± 4.9523.87 ± 1.25CD71-7251.1990.92 ± 14.2265.75 ± 4.8120.77 ± 1.47CD27-2840.95110.67 ± 10.9767.23 ± 1.2722.20 ± 0.87HBB:c.170G > A30.71125.50 ± 7.5093.07 ± 3.8631.50 ± 2.13HBB:c.162delT20.48111 ± NA58.1 ± NA18.8 ± NAHBB:c.180G > C20.48121.20 ± 7.4092.75 ± 1.2530.35 ± 1.05HBB:c.316-179A > C20.48101.50 ± 8.50100.25 ± 21.3531.80 ± 7.80-29C/N10.2496.00 ± NA76.10 ± NA26.40 ± NAHBB:c.341T > A10.24161 ± NA90.3 ± NA30.6 ± NAHBB:c.170G > A10.24124 ± NA97.2 ± NA31.2 ± NAHBB:c.100G > A10.24^#^122.00 ± NA^#^80.60 ± NA^#^26.4 ± NAHBB:c.113G > A10.24113 ± NA66.1 ± NA20.8 ± NAHBB:c.130G > C10.24127 ± NA83.2 ± NA27.8 ± NAHBB:c.246C > A10.24122 ± NA78.4 ± NA25.1 ± NAHBB:c.27G > C10.24^#^124.80 ± 0.79^#^94.78 ± 10.96^#^30.33 ± 3.7HBB:c.304G > C10.24^#^105.00 ± NA^#^82.70 ± NA^#^26.00 ± NAHBB:c.327C > T10.24115 ± NA75.9 ± NA23.5 ± NAHBB:c.364G > C10.24114 ± NA108.2 ± NA36.7 ± NAHBB:c.68A > C10.24156 ± NA84.4 ± NA29 ± NAβEM / CD27-2810.2494.20 ± NA63.10 ± NA20.30 ± NAHBB:c.92 + 2T > C10.24^#^96.00 ± NA^#^55.7 ± NA^#^17.8 ± NAHBB:c.315 + 5G > C10.24^#^94.5 ± NA^#^48.99 ± NA^#^15.81 ± NAHBB:c.316-146T > G10.24^#^121.00 ± NA^#^59.60 ± NA^#^18.60 ± NATotal421100NANANA^#^Data are derived from other literature or databasesTable 4Distribution of composite α- and β-thalassemia genotypesGenotypeCounts (n)Percentages in α-compositeβ-thalassemia (%)Hb (g/L) (mean ± SD)MCV (fL)(mean ± SD)MCH(pg/cell)αβ -α3.7CD41-42416.00115.00 ± 2067.00 ± 1.8022.05 ± 1.25-α3.7CD17312.0086.00 ± NA65.10 ± NA20.60 ± NA_SEACD41-4228.00NANANA_SEACD1728.00NANANA-α3.7IVS-II-65428.00126.00 ± NA92.90 ± NA32.00 ± NA-α3.7βEM14.00157 ± NA84.6 ± NA28.8 ± NA_SEAIVS-II-65414.00138 ± NA67.9 ± NA22 ± NA-αCSIVS-II-65414.00113 ± NA67.6 ± NA20.7 ± NA_SEACD27-2814.00117.00 ± NA66.50 ± NA22.70 ± NA-αCSCD41-4214.00109.00 ± NA67.70 ± NA21.10 ± NA_SEACD14-1514.00NANANA-αCSβEM14.00NANANA-α4.2CD1714.00NANANA_SEAHKαα14.00NANANAHBA2:c.-24C > GHBB:c.315 + 180T > C14.00116 ± NA78 ± NA25.3 ± NAHBA2:c.300 + 55T > GHBB:c.34G > A14.0097 ± NA8 1.50 ± NA24 ± NAHBA2:c.300 + 55T > GHBB:c.315 + 252C > T14.00NA73.3 ± NA23 ± NATotal25100NANANATable 5Genotypes of unreported thalassemiaHGVS nomenclatureMutation typeHb (g/L)MCV (fL)MCH (pg/cell)HBA1:c.300 + 35G > Aα-thalassemiaNA10537.4HBA2:c.193C > Tα-thalassemiaNA87.331.4HBA2:c.316C > Tα-thalassemiaNA73.823.8NC_000016.10:g.165401_184701del/ NC_000016.10:g.169818_174075del/HBA1:c.223G > Cα-thalassemia3586.824.3HBB:c.315 + 300-308AAAAAAAAA > AAAAAAAAα-thalassemia80122.439.8HBB:c.-153C > Aβ-thalassemiaNA66.520HBB:c.*34G > Aβ-thalassemiaNA74.122.4HBB:c.-107A > Cβ-thalassemiaNA80.422.4HBB:c.-23 A > Gβ-thalassemia10878.324.1HBB:c.315 + 299–308 AAAAAAAAA > AAAAAAAAβ-thalassemiaNA76.823.8HBB:c.315 + 300-308AAAAAAAAA > AAAAAAAAAAβ-thalassemia115.00 ± 5.0081.30 ± 9.6025.85 ± 5.25HBB:c.316–113 A > Tβ-thalassemia8265.717.6HBB:c.316–31 C > Tβ-thalassemia1107725.6HBB:c.315 + 300-308AAAAAAAAA > AAAAAAAAβ-thalassemiaNA58.413.5HBA1:c.196 G > A /HBB:c.-11-8delAAACα-composite β-thalassemiaNANANAHBA1:c.196G > A/HBB:c.52A > Tα-composite β-thalassemia12260.319.7NG_000006.1:g.34164_37967del3804/HBA2:c.427 T > C/HBB:c.52A > Tα-composite β-thalassemiaNANANA
Guizhou is home to many ethnic minorities, and a significant number of unique ethnic groups reside in the area, some of whom may carry distinctive thalassemia variant. Our retrospective analysis indicated that among those who tested positive for thalassemia, individuals of Han ethnicity constituted the highest proportion, followed by those from Buyi, Miao, and Tujia ethnic groups (Table 6).Table 6. The ethnic distribution of thalassemia in GuizhouEthnicα-thalassemia (%)β-thalassemia (%)Total composition (%)HAN725 (82.48)335 (93.31)1060 (85.62)BUYI18 (2.05)10 (2.79)28 (2.26)MIAO12 (1.37)6 (1.67)18 (1.45)TUJIA3 (0.34)3 (0.84)6 (0.48)YI3 (0.34)1 (0.28)4 (0.32)DONG2 (0.23)1 (0.28)3 (0.24)QILAO1 (0.11)2 (0.56)3 (0.24)SHUI1 (0.11)1 (0.28)2 (0.16)BAI1(0.11)0 (NA)1 (0.08)Total879 (100)359 (100)1238 (100)
Geographical distribution of the thalassemia population
To delve deeper into the spatial distribution of thalassemia carriers, we conducted an analysis whereby the household registration locations of affected individuals were utilized to generate a population distribution map. This geographical representation unveiled that the majority of individuals testing positive for thalassemia gene variants within our study cohort were predominantly concentrated in Guizhou Province, with additional clusters observed in adjacent provinces. Within the delineated 9 major administrative regions of Guizhou Province, it was discerned that the highest density of affected individuals was discerned in Guiyang city, with subsequent concentrations noted in Bijie and Qiannan Prefecture (Fig. 2).Fig. 2. The distribution of the positive population included in the research on the map
Discussion
Thalassemia is a group of inheritable single-gene disorders that pose a significant global public health concern due to their potential to result in fetal deaths or birth defects [13]. The two most common types of thalassemia are α- and β-thalassemia [14]. In α-thalassemia, genetic variants typically entails large deletions in the α-globin gene, with a smaller proportion caused by point mutation. Conversely, β-thalassemia is predominantly caused by point mutation in the β-globin gene. While most individuals with thalassemia are asymptomatic carriers of the thalassemia deletion gene, a minority present with microcytic hypochromic anemia. The primary treatments for severe manifestations of thalassemia include blood transfusions and iron chelation therapy, both of which are palliative measures that do not address the underlying cause of the disorder [15]. Consequently, patients with thalassemia require lifelong, continuous treatment, placing a substantial financial burden on families and society. Although promising gene therapies for thalassemia have emerged, their long-term safety and efficacy remain to be determined. The primary focus of thalassemia management involves increasing awareness and preventive measures [16]. By promoting education about the risks of thalassemia and encouraging widespread thalassemia screening, particularly among prospective parents, the incidence of moderate to severe forms of the disorder can be effectively reduced. In the Guiyang area, all pregnant women are screened for thalassemia in the first trimester. Some institutions perform hemoglobin electrophoresis screening, and if abnormal hemoglobin is detected, they proceed with genetic screening. Other institutions directly screen for thalassemia genes. If a pregnant woman is found to be a carrier of the thalassemia gene, her husband may also be screened upon a doctor's recommendation. If the husband is also found to carry the thalassemia gene, some pregnant women may receive a prenatal diagnosis after a doctor's assessment to check for the risk of severe thalassemia in fetus.
Currently, the detection of thalassemia primarily involves the utilization of FHGC, high-resolution melting analysis (HRMA), next-generation high-throughput sequencing, third-generation sequencing, gel electrophoresis, first-generation sequencing, and multiplex ligation-dependent probe amplification (MLPA) alongside validation [17, 18]. FHGC and HRMA are favored for their simplicity and cost-effectiveness in concurrently identifying multiple common mutation sites, rendering them essential tools in early thalassemia genetic screening. The advancement and enhancement of technology have propelled next-generation sequencing to the forefront in numerous clinical settings, improving throughput and encompassing a broader spectrum of mutation sites [19].
Third-generation sequencing, also referred to as long-read sequencing, signifies the latest breakthrough in DNA sequencing technology [20]. In contrast to traditional short-read sequencing approaches producing short DNA fragments typically measuring 100–300 base pairs, third-generation sequencing platforms can generate considerably longer reads, extending across thousands to tens of thousands of base pairs or even longer [21]. Although the utilization of third-generation sequencing can enhance the detection of rare variants sites, challenges such as high costs, the necessity for heightened precision, and extended reporting timelines currently limit its applicability in thalassemia genetic testing [22]. However, it is expected that as these obstacles are gradually overcome, third-generation sequencing will emerge as the predominant method for thalassemia gene analysis.
The approach employed for the identification of thalassemia gene mutations in this investigation incorporated FHGC in combination with agarose gel electrophoresis and first-generation sequencing for validation. Among the 20,478 subjects subjected to testing, 1401 individuals exhibited positive results for thalassemia gene mutations, yielding a positivity rate of 6.84%, which surpasses the national average carrier frequency of thalassemia standing at 3.62%. Within this cohort, 942 individuals were found to carry α-thalassemia variants, with a positivity rate of 4.60%, 431 individuals harbored β-thalassemia variants, reflecting a positivity rate of 2.11%, and 25 individuals presented with both α and β-thalassemia variants, presenting a positivity rate of 0.12%.
Comparative analysis of our findings with the documented carrier rates of 25% in individuals under the age of 18 and 3.28% in newborns within Guizhou Province reveals that our institution's positivity rate falls within this range [23, 24]. Discrepancies in positivity rates can be attributed to variances in the demographic composition of the study populations. Previous reports primarily focused on pediatric or neonatal populations, whereas our study cohort primarily consisted of healthy individuals undergoing routine medical assessments. This suggested that our data more accurately represent the prevalence of thalassemia carriers within the Guiyang region. Nevertheless, as our institution specializes in traditional Chinese medicine, the admission of patients with moderate to severe forms of thalassemia may be limited, potentially distorting the epidemiological profile of these patient groups. It is important to note that the majority of the population included in our study were women, as thalassemia screening is generally conducted among pregnant women, resulting in fewer men being involved. Additionally, 550 of the participants were hospitalized patients exhibiting symptoms related to thalassemia. These individuals, suspected of having thalassemia, underwent screening, which may cause the detection rate in this study to be slightly higher than the actual local detection rate. These problems may lead to statistical bias.
Through a retrospective examination, we identified 29 variations of α-thalassemia, 30 variants of β-thalassemia, 18 instances of combined α and β thalassemia, and 17 previously unreported variants. Contrary to previous assertions indicating _SEA as the predominant mutation in southern China [25], our investigation revealed that -α3.7 constituted the prevailing mutation (48.99%) within our subpopulation, followed by _SEA (24.12%) and -α4.2 heterozygosis (9.50%). Among the 28 identified β-thalassemia genotypes, the CD17 [codon 17 (A->T)] β0 heterozygote (36.58%), CD41-42 [41/42 (-TTCT)] β0 heterozygote (29.69%), and IVS-II-654 (C>T) β + heterozygote (14.25%) were the most prevalent variants, consistent with documented data from Guizhou Province but deviating from observations in Guangdong Province or Hainan Province [26, 27].
Furthermore, our study unveiled 17 novel mutation loci, 5 associated with α-thalassemia, 9 with β-thalassemia, and 3 with combined α and β thalassemia. Regrettably, due to constraints in data acquisition, comprehensive hematological parameters were only obtainable for carriers of 7 mutation sites, impeding the determination of the clinical significance of the remaining mutation loci.
The ethnically diverse landscape of Guizhou Province has fostered a unique spectrum of thalassemia variants. With the advent of modern transportation and increasing cultural integration, cohabitation and intermarriage among different ethnic groups have likely facilitated the spread of thalassemia carriers, including those with rare mutation sites. Our findings indicate that the majority of thalassemia carriers in the Guiyang region are from Han, Buyi, Miao, and Tujia ethnic backgrounds, reflecting their demographic prominence in the area. Given the ethnic diversity in Guizhou, we advocate for enhanced thalassemia screening to raise awareness about this disease.
Located in Guiyang, the capital of Guizhou Province, our center likely serves patients from across the province. By mapping thalassemia carriers based on their registered domiciles within Guizhou, we identified highest concentrations in Guiyang City, followed by Bijie City and Qiannan Prefecture. This is related to the baseline distribution of the population in the region. Additionally, thalassemia prevalence is notably higher in southern Chinese provinces like Guizhou, while significant carrier numbers are also observed in northern regions. However, consensus among experts regarding universal thalassemia screening in northern provinces remains lacking.
Advancements in thalassemia detection technology, from Sanger sequencing to next-generation methods, have significantly progressed. However, implementing large-scale screening remains economically challenging, considering regional disparities in China and variations in thalassemia epidemiology. Local health authorities must balance cost considerations with effective screening strategies tailored to local realities to minimize severe thalassemia cases.
Conclusions
This study elucidated the presence of thalassemia carriers at our center through molecular characterization, while also considering ethnicity and regional dispersion. Our findings furnish valuable insights into the distribution of thalassemia within the Guiyang area, thereby aiding clinicians in assessing the likelihood of carriers transmitting severe thalassemia to offspring. Such insights hold utility in shaping policies aimed at thalassemia prevention and control. Future screening efforts should consider including northern Chinese populations in routine screening to prevent treatment delays and reduce births of children with severe thalassemia.
Supplementary Information
Supplementary Material 1.
