Serum microRNA expression quantitative trait loci in children with asthma colocalize with asthma-related GWAS results
Julian Hecker, Anshul Tiwari, Rinku Sharma, Kevin Mendez, Jiang Li, Sofina Begum, Qingwen Chen, Albert Smith, Juan C. Celedón, Scott T. Weiss, Rachel S. Kelly, Jessica A. Lasky-Su, Kelan G. Tantisira, Michael McGeachie

TL;DR
This study finds genetic links to microRNA expression in children with asthma that overlap with known asthma risk genes.
Contribution
Identifies novel miRNA-QTLs in asthma and shows their overlap with asthma-related GWAS results.
Findings
28 significant cis-miRNA-QTL associations were identified, 12 of which are novel.
Three novel miRNA-QTLs were replicated in an independent cohort.
The QTLs overlap with asthma-related GWAS hits and tissue-specific expression QTLs.
Abstract
Asthma poses a significant public health burden. Despite identifying more than a hundred genetic risk loci in genome-wide association studies (GWAS), the underlying functional mechanisms remain poorly understood. Studying omics, especially microRNAs (miRNAs), is a promising approach to facilitate our understanding of the biological pathways of asthma. Here, we performed miRNA expression quantitative trait loci (miRNA-QTL) analyses using whole-genome sequencing and serum-based miRNA expression data from two independent cohorts of children with asthma (Genetic Epidemiology of Asthma in Costa Rica Study (GACRS), (NCT00021840, 2005-06-23) (N = 980, Discovery) and the Childhood Asthma Management Program (CAMP) (NCT00000575, 2005-06-23) (N = 354, Replication)). Our robust discovery analysis identified 28 significant cis-miRNA-QTL associations, where 12 were not reported in three independent…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2- —https://doi.org/10.13039/100000050U.S. Department of Health & Human Services | NIH | National Heart, Lung, and Blood Institute (NHLBI)
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCancer-related molecular mechanisms research · MicroRNA in disease regulation · IL-33, ST2, and ILC Pathways
Introduction
Asthma is a common chronic respiratory disease that affects more than 300 million people worldwide and imposes a significant global health burden^1–4^. Although recent large-scale genome-wide association studies (GWAS) identified more than a hundred genetic associations with asthma^5,6^, the underlying functional mechanisms of asthma remain poorly understood and are hypothesized to comprise a complex interplay between genetic, early life, and environmental factors^7^. Moreover, asthma is considered an umbrella diagnosis covering several phenotypes with distinct biological pathways and clinical manifestations^8,9^, increasing the complexity of understanding and studying the etiology.
Multiple omics levels have been studied to disentangle the biological pathways of asthma, including genomics, metabolomics, epigenomics, and transcriptomics. Recently, non-coding RNAs, specifically microRNAs (miRNAs), and their effects on the immune system have attracted more attention in asthma research^10^. MiRNAs are small non-coding RNA molecules ranging from 18 and 22 nucleotides acting as post-transcriptional regulators of target gene expression by interacting with messenger RNA and playing a significant role in regulating epithelial cell and inflammatory processes^11^. These processes can contribute to airway remodeling and obstruction^10,12^ and implicate miRNAs as essential biomarkers and research targets for understanding asthma development. Furthermore, miRNAs are gaining attention as a therapeutic tool to restore cell functions that are altered in the development of diseases^13^.
Previous work demonstrated that genetic variants regulate the expression levels of miRNAs in plasma and blood^14,15^. These genetic variants are referred to as miRNA expression quantitative trait loci (miRNA-QTLs). The identification of miRNA-QTLs has the potential to link the genetic factors associated with asthma and related traits to biological mechanisms through miRNAs.
In this study, we sought to identify serum miRNA-QTLs in children with asthma. Our analyses utilized whole-genome sequencing (WGS) data for the two family-based childhood asthma cohorts ‘Genetic Epidemiology of Asthma in Costa Rica Study’ (GACRS)^16^ and the Childhood Asthma Management Program (CAMP)^17^. We performed cis- and trans-analyses using a discovery and replication approach. The identified miRNA-QTLs were further investigated for colocalization with expression QTLs, splicing QTLs, and asthma-related GWAS results.
Results
The workflow of our analyses is visualized in Supplementary Fig. 1.
Study populations with WGS and miRNA data
After quality control, we had 980 GACRS offspring and 354 CAMP offspring samples with both WGS and miRNA expression data at our disposal. The baseline characteristics of both study cohorts are described in Table 1. The general baseline factors were comparable between GACRS and CAMP, except for race/ethnicity and lung function. CAMP subjects are ethnically/racially diverse and had slightly lower lung function measurements, while GACRS subjects are of Hispanic (Costa Rican) ancestry. The total number of miRNAs in GACRS and CAMP successfully mapped to genomic positions after quality control was 316 and 264, respectively. The overlap consisted of 252 miRNAs. The number of cis-miRNA windows in the miRNA-QTL analyses was 361 in GACRS and 303 in CAMP, reflecting the mapping of miRNAs to multiple genomic positions as obtained based on recent miRbase (v22) data. A total of 5,872,254 and 5,295,172 biallelic common autosomal genetic variants were available for analysis in GACRS and CAMP after quality control, respectively.Table 1. Characteristics of the children with asthma included in the analysisGACRSCAMPp-valueN980354Sex (female %)40%41%p = 0.87Age (mean (sd) in years)9.25 (1.87)8.99 (2.17)p = 0.05Height (mean (sd) in meters)1.33 (0.12)1.34 (0.14)p = 0.10FEV_1_ (mean (sd) in liter)1.78 (0.50)1.65 (0.46)p = 2.00×10^-5^**FEV_1_% predicted (mean (sd))98.63 (17.08)93.29 (13.06)p = 2.64×10^-9^**Maternal asthma history (yes/no/unknown)298/678/4 (30%/69%/1%)98/249/7 (28%/70%/2%)p = 0.46Paternal asthma history (yes/no/unknown)253/716/11 (26%/73%/1%)72/261/21 (20%/74%/6%)p = 0.12EthnicityHispanic (100%)280 (79%) European American/ 58 (16%) African American/ 16 (5%) Hispanicp* < 10^−16^**based on chi-squared-test.**based on t test.
Cis-miRNA QTL analysis
The GACRS cis-analysis incorporated 321 cis-miRNA windows with overlapping genetic variants. We identified a total of 28 top-cis-miRNA-QTLs in GACRS with an FDR q-value q<=0.05 (Table 2) using tensorQTL^18^. The corresponding miRNAs are located across different chromosomes, but notably, 11 miRNAs are on chromosome 14 in the 14q32 cluster^19^. For the replication analysis, 22 of the 28 implicated miRNAs were available in the CAMP analysis. In particular, 20 of the 28 variant-miRNA pairs were directly available in CAMP for replication analysis, and one additional variant-miRNA pair was tested using a proxy SNP in high LD (for hsa-miR-181a-5p, we tested rs2360962 in CAMP instead of rs7529925). One of the 22 miRNAs had no information regarding the implicated genetic variants available in CAMP (hsa-miR-3679-5p, top cis-miRNA QTL and LD proxies had MAF < 5% in CAMP) and was, therefore, not considered in the replication analysis. We observed that 7 of these pairs had a nominally significant association p-value p< =0.05 in CAMP. For four pairs, the p-value was significant after correcting for 21 tests using a Bonferroni correction at the significance level of 0.05, with a consistent direction of effect. The corresponding miRNAs were hsa-miR-4433b-3p, hsa-miR-3131, hsa-miR-181a-5p, and hsa-miR-335-3p. Here, hsa-miR-3131 showed strong genetic associations in both GACRS and CAMP, and a colocalization analysis between both cis-QTL statistics resulted in a high posterior probability of a consistent signal (colocalization posterior probability > =0.95), visualized in Fig. 1. The relationship between the top-cis-miRNA-QTL, rs6751804, and hsa-miR-3131 expression is illustrated in Supplementary Fig. 2. We also observed strong evidence for a shared genetic factor for the miRNAs hsa-miR-127-3p, hsa-miR-136-3p, hsa-miR-370-3p, and hsa-miR-409-3p on 14q32, displayed in Supplementary Figs. 3 and 4 (colocalization posterior probability 0.89). As a sensitivity analysis, we investigated the differences between the corresponding nominal association p-values for the 28 top-cis-miRNA-QTL signals when incorporating the top five or top ten miRNA expression principal components derived from the ReFACTor approach^20^ (see Methods). The results remained qualitatively unchanged. The same was observed when replacing the principal components with the factors derived from the non-negative matrix factorization (see Methods). Additional sensitivity analyses stratified by age group (<=9 years of age vs. > 9 years of age) showed no significant differences in both GACRS and CAMP after Bonferroni correction. This also holds true for interaction analyses between QTL and the first principal component of genetic ancestry. The 28 cis-miRNA-QTL associations are described in full detail in Supplementary Table 1. Furthermore, all 28 miRNAs were found to be expressed in lung and lung bronchus tissue (Supplementary Table 6) in the miRNATissueAtlas2^21^. Finally, we performed a meta-analysis based on the cis-miRNA-QTL statistics between GACRS and CAMP, for all overlapping miRNAs. The results of this meta-analysis did not yield any additional findings.Fig. 1. LocusCompare plot of cis-miRNA-QTL statistics for hsa-miR-3131 in GACRS and CAMP data. -log10(P) denotes the negative logarithm of the nominal association p-values. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${r}^{2}$$\end{document} Values report the strength of linkage disequilibrium between the lead variant (purple) and the corresponding other genetic variant; colors indicate the strength of linkage disequilibrium. This plot was created using the LocusCompareR R package.Table 2. Significant top-cis-miRNA-QTL results in GACRS (FDR q-value < =0.05)miRNACHRvariantAFp-value nominalq-valuereplication CAMPreplication Nikpay et al.replication Huan et al.replication Mustafa et al.hsa-miR-181a-5p1rs75299250.724.65E−101.50E−04YesYesYes****Yeshsa-miR-4433b-3p2rs759823810.185.47E−144.92E−08Y****esNANoNAhsa-miR-3679-5p2rs101753830.112.24E−132.00E−07NANoNoNohsa-miR-31312rs67518040.523.21E−466.55E−36Y****esNoNoNohsa-miR-5774rs357368520.122.84E−274.33E−19NANoNoNohsa-miR-584-5p5rs360470.491.58E−071.30E−02NoY****esNoYeshsa-miR-550a-3-5p7rs12300470.395.91E−071.36E−02NANoNoYeshsa-miR-335-3p7rs792576630.181.72E−117.52E−06Y****esNoNoNohsa-miR-873-3p9rs13689980.103.53E−081.52E−03NoNoNoNohsa-miR-873-5p9rs22113820.301.04E−073.59E−03NANoNoYeshsa-miR-1908-5p11rs1745560.439.86E−109.95E−05NoY****esNoYeshsa-miR-100-5p11rs1404713790.373.85E−119.88E−06NoY****esNoNohsa-miR-625-5p14rs352874940.055.31E−064.85E−02NANoNoNohsa-miR-625-3p14rs21278700.781.37E−087.95E−04NoNAYesNAhsa-miR-493-5p14rs619932610.254.61E−071.27E−02NANoNoNohsa-miR-127-3p14rs49004710.502.06E−083.03E−03NoYesYes****Yeshsa-miR-432-5p14rs125887180.472.04E−063.52E−02NoNoNoYeshsa-miR-136-3p14rs80129700.425.70E−082.57E−03NoNoNoYeshsa-miR-370-3p14rs128828150.531.97E−077.35E−03NoNoNoYeshsa-miR-411-3p14rs124329550.756.67E−071.46E−02NANoYes****Yeshsa-miR-654-5p14rs356384460.232.58E−063.37E−02NoNoNoNohsa-miR-382-5p14rs80129700.423.12E−064.26E−02NoNoYes****Yeshsa-miR-323b-3p14rs65758120.329.76E−126.87E−06NoYesNoYeshsa-miR-409-5p14rs42437290.461.65E−063.34E−02NoNoNoNohsa-miR-409-3p14rs128817600.411.07E−063.34E−02NoNoYes****Yeshsa-miR-361517rs99095460.071.19E−062.65E−02NoNANoNAhsa-miR-1-3p20rs606407280.203.46E−111.25E−05NoNANoNAhsa-miR-94120rs24275550.449.41E−113.21E−05NoYesYesYesReplication information for CAMP, Nikpay et al.^14^, Huan et al.^15^, and Mustafa et al.^22^ data are shown in the last four columns. NA: miRNA/proxy variants not available/not included in corresponding study. For the Huan et al. study, ‘no’ reflects either non-available miRNAs or no replication.AF allele frequency, FDR False Discovery Rate, CHR Chromosome.
Trans-miRNA QTL analysis
A trans-analysis using tensorQTL identified three significant trans-associations involving the miRNAs hsa-miR-148a-3p, hsa-let-7g-5p, and hsa-miR-15a-5p. These associations did not replicate in CAMP. Notably, all corresponding genetic variants in the GACRS analysis are low-frequency variants. The results are reported in Supplementary Table 2; further research is required to confirm or refute the observed associations.
Replication using results from previously published miRNA-QTL analyses
We investigated whether the significant cis-miRNA-QTL associations were previously reported in three external miRNA-QTL studies.
Overall, 24 of the 28 miRNAs implicated by our observed cis-miRNA-QTL associations were also investigated by Nikpay et al.^14^ A total of 7 cis-miRNA-QTL associations showed evidence for replication (Table 2). The results for hsa-miR-181a-5p, hsa-miR-584-5p, hsa-miR-1908-5p, hsa-miR-323b-3p, and hsa-miR-941, demonstrated replication based on the variant-miRNA association p-value (directly or using a proxy genetic variant) and showed evidence for colocalization (colocalization posterior probability > =0.8). For hsa-miR-100-5p and hsa-miR-127-3p, we observed evidence for colocalization of signals, but no proxy variant was available.
Huan et al.^15^ reported significant cis-QTL findings for 10 of the 28 miRNAs implicated by our cis-miRNA-QTL associations. A total of 7 of our specific 28 cis-miRNA-QTL associations demonstrated replication (Table 2). The corresponding miRNAs were hsa-miR-181a-5p, hsa-miR-625-3p, hsa-miR-127-3p, hsa-miR-411-3p, hsa-miR-382-5p, hsa-miR-409-3p, and hsa-miR-941. As for the Nikpay et al. data, 24 out of 28 miRNAs were analyzed by Mustafa et al.^22^. For 14 out of these 24 miRNAs, we observed evidence for replication, as reported in Table 2.
Overall, 12 out of the 28 cis-miRNA-QTL associations were not reported in any of the three independent external miRNA-QTL-studies.
Colocalization analysis with eQTLs
The colocalization analysis with eQTLs was based on Trans-Omics for Precision Medicine (TOPMed) data and identified several top-cis-miRNA-QTLs colocalizing with cis-eQTLs in asthma-related tissues (Table 3). We observed colocalization between the cis-miRNA-QTL statistics for hsa-miR-1908-5p and cis-eQTL statistics for FADS1 in lung tissue, PBMCs, and nasal epithelial cells, and FADS3 in T-cells. Related, the cis-eQTL statistics for TMEM258 colocalized with cis-miRNA-QTL statistics for hsa-miR-1908-5p in monocytes, PBMCs, and T-cells. Furthermore, we detected colocalization between cis-eQTL statistics for MEG3 and cis-miRNA-QTL statistics for multiple miRNAs in the 14q32 cluster in PBMC and whole blood. Next, UCKL1 and hsa-miR-941 cis-statistics colocalized in PBMCs and T-cells. The cis-miRNA QTL signals for hsa-miR-335-3p overlapped with cis-eQTLs for MEST in monocytes, nasal epithelial cells, PBMCs, T-cells, and whole blood. Additional tissue/cell-type-specific colocalizations were observed between SH3TC2 and hsa-miR-584-5p in lung, and MIR100HG/UBASH3B and hsa-miR-100-5p in nasal epithelial cells. All observed eQTL colocalizations are reported in Supplementary Table 3.Table 3. Results of the colocalization analyses between top-cis-miRNA-QTL-associations and eQTLs and sQTLsmiRNACHReQTL colocalizationsQTL colocalizationhsa-miR-181a-5p1––hsa-miR-4433b-3p2––hsa-miR-3679-5p2––hsa-miR-31312–NHEJ1 in lung;hsa-miR-5774––hsa-miR-584-5p5SH3TC2 in lung;–hsa-miR-550a-3-5p7––hsa-miR-335-3p7MEST in monocytes; MEST in nasal epithelial cells; MEST in PBMC; MEST in T-cells; MEST in whole blood;-hsa-miR-873-3p9––hsa-miR-873-5p9––hsa-miR-1908-5p11FADS1 in lung; TMEM258 in monocytes; FADS1 in nasal epithelial cells; TMEM258 in PBMC; FADS1 in PBMC; TMEM258 in T-cells; FADS3 in T-cells;FADS1 in lung; FADS2 in lung; FADS2 in monocytes; FEN1 in lung; TMEM258 in nasal epithelial cells; TMEM258 in PBMC; TMEM258 in T-cells; TMEM258 in whole blood;hsa-miR-100-5p11MIR100HG in nasal epithelial cells; UBASH3B in nasal epithelial cells;–hsa-miR-625-5p14––hsa-miR-625-3p14LINC02324 in PBMC; LINC02324 in whole blood;–hsa-miR-493-5p14––hsa-miR-127-3p14MEG3 in PBMC; MEG3 in whole blood;PPP2R5C in PBMC; PPP2R5C in T-cells; PPP2R5C in whole blood;hsa-miR-432-5p14––hsa-miR-136-3p14MEG3 in PBMC; MEG3 in whole blood;–hsa-miR-370-3p14MEG3 in PBMC; MEG3 in whole blood;–hsa-miR-411-3p14––hsa-miR-654-5p14––hsa-miR-382-5p14––hsa-miR-323b-3p14MEG3 in PBMC; MEG3 in whole blood;-hsa-miR-409-5p14––hsa-miR-409-3p14MEG3 in PBMC; MEG3 in whole blood;-hsa-miR-361517––hsa-miR-1-3p20–CRMA in lung;hsa-miR-94120DNAJC5 in nasal epithelial cells; UCKL1 in PBMC; UCKL1 in T-cells;TPD52L2 in lung; TPD52L2 in T-cells; UCKL1 in whole blood;eQTL expression quantitative trait loci, sQTL splicing quantitative trait loci, PBMC peripheral blood mononuclear cells, CHR Chromosome.
Colocalization analysis with sQTLs
Investigating the TOPMed splicing QTL results, we observed an overlap between multiple genetic factors underlying miRNAs and sQTLs (Table 3). This includes hsa-miR-1908-5p and FADS1, FADS2, and FEN1 in lung tissue, and TMEM258 in nasal epithelial cells, PBMCs, T-cells, and whole blood. The cis-miRNA-QTL statistics for hsa-miR-127-3p colocalized with sQTLs for PPP2R5C in PBMCs, T-cells, and whole blood. We also observed colocalization between cis-QTLs for hsa-miR-941 and TPD52L2 in lung and T-cells. The cis-miRNA-QTL statistics for the miRNA implicated by the strongest hit in our QTL analysis, hsa-miR-3131, overlapped with sQTL statistics for NHEJ1 in lung tissue. The complete results of the colocalization analysis with sQTLs are reported in Supplementary Table 4.
Colocalization analysis with GWAS summary statistics
We investigated the overlap between the detected cis-miRNA-QTLs and GWAS signals for blood cell traits^23^, lung function^24^, childhood/adult-onset asthma^6^, and a large-scale asthma meta-analysis^5^. Previously, significant associations between blood cell indices and autoimmune diseases such as asthma were observed^25^, including eosinophil counts, neutrophil counts, lymphocyte counts, and monocytes. We obtained the GWAS summary statistics for 28 out of 29 blood cell traits based on European ancestry participants in the UK Biobank^26^, as provided by Vuckovic et al. All traits and the respective results of the colocalization analyses are reported in Supplementary Table 5 and visualized in Fig. 2. The cis-miRNA-QTL statistics for hsa-miR-181a-5p colocalized with the GWAS summary statistics for eosinophil, platelet, and red blood cell counts. Moreover, the hsa-miR-1908-5p cis-QTL signal showed strong colocalization with the summary statistics for neutrophil counts and asthma. The latter was specific to the meta-analysis by Tsuo et al.^5^ and the adult-onset analysis by Ferreira et al.^6^; it was not observed for the corresponding childhood-onset analysis. Plateletcrit GWAS statistics colocalized with the cis-miRNA-QTL statistics for hsa-miR-100-5p and hsa-miR-625-3p. Finally, the cis-QTL statistics for miRNAs in the 14q32 cluster colocalized with several blood cell traits, including neutrophil counts. No colocalizations with lung function GWAS hits were observed.Fig. 2. Illustration of the colocalization analysis results between the 28 GACRS top-cis-miRNA-QTL statistics and GWAS summary statistics for asthma-related and blood cell traits.Colored tiles visualize the posterior probability of colocalization. PL: Platelets; LC: Lymphocytes; PS: Phils; LF: Lung function; AST: Asthma; RC: Red cells. This plot was created using the ggplot2 R package.
Discussion
We performed robust miRNA-QTL analyses using serum miRNA expression levels and whole-genome sequencing data in two studies of childhood asthma. The discovery analysis was based on samples from the Genetic Epidemiology of Asthma in Costa Rica Study, and the results were tested for replication using data from the Childhood Asthma Management Program. We discovered 28 miRNAs with significant top-cis-miRNA-QTLs in GACRS, and four of these top-cis-miRNA-QTL associations were replicated in CAMP. The trans-analysis identified three significant trans-miRNA-QTL associations in GACRS that require further follow-up studies to confirm their validity. Comparing the identified cis-miRNA-QTL associations with the results of three previous relevant miRNA-QTL studies, we observed an overlap of 16 miRNA-QTL-signals, leaving 12 signals that were not reported in these studies. Notably, the cis-QTL association of hsa-miR-3131, with strong replication in CAMP, is novel, to the best of our knowledge.
Previous studies demonstrated that miRNAs are vital biomarkers and research targets for understanding asthma development^10^. Establishing the link between genetic variants and miRNAs, especially in a target population of children with asthma, facilitates this process. Thus, miRNA-QTL associations that potentially appear specific to children (with asthma), such as the signals related to hsa-miR-3131, are of increased interest. Multiple miRNAs linked to the identified cis-miRNA-QTLs were described in prior studies of asthma. In particular, hsa-miR-181a was linked to asthmatic airways and airway remodeling^27^. Moreover, hsa-miR-625, and hsa-miR-409 were linked to childhood asthma^28^. Finally, hsa-miR-370 was demonstrated to be associated with asthma in adults^29^.
Furthermore, we performed extensive colocalization analyses based on the cis-miRNA-QTL statistics and external summary statistics from eQTL, sQTL, and GWAS analyses. The selected eQTL, sQTL, and GWAS summary statistics correspond to asthma-relevant tissues, cells, and traits. We incorporated sQTL associations since RNA splicing potentially plays an important role in the connection between genetic variation and complex diseases^30^. The results showed that cis-miRNA-QTLs overlap with cis-eQTLs and cis-sQTLs of genes, of which multiple were implicated in asthma. These genes include FADS1, FADS2, FADS3, TMEM258, UCKL1, PPP2R5C, TPD52L2, MEG3, and NHEJ1.
FADS1 and FADS2 were linked to asthma in several previous studies^31–34^. For example, Huang et al. reported that an association between polyunsaturated fatty acids and asthma risk was linked to FADS1 polymorphisms^33^. Moreover, Zaied et al. implicated FADS1, FADS2, and TMEM258 in the context of the metabolism of arachidonic acid and overexpression of cysteinyl leukotrienes in patients with asthma^35^. Zhu et al. linked genetic variants in this locus to asthma and fasting glucose^36^. Zhu et al. showed that DNA methylation linked to UCKL1 predicts asthma severity in African American children^37^. Whittle et al. characterized the molecular structure of human dust-mite-associated allergic asthma. The Mitotic Roles of the Polo-Like Kinase pathway was predicted to have significantly altered activity in asthma patients. One of the molecules with decreased gene expression was PPP2R5C^38^. In another study, seropositivity to Ascaris lumbricoides was associated with a CpG site linked to TPD52L2 in men^39^. Chiu et al. detected an association between SNPs in MEG3, asthma status and severity, and MEG3 expression^40^. Furthermore, Feng, Yang, and Yan identified differential MEG3 expression in peripheral blood between individuals with asthma and those without asthma^41^.
The colocalization analysis with the GWAS associations identified strong colocalizations with susceptibility loci for neutrophil, platelet, and eosinophil counts, as well as asthma affection status. These overlaps included hsa-miR-181a-5p, hsa-1908-5p, hsa-miR-100-5p, hsa-miR-625-3p, and the miRNAs located in the 14q32 cluster. Interestingly, this suggests a link between the genetic regulation of miRNAs and blood cell traits, though further analyses are required to disentangle causal relationships. Previous studies established a connection between blood eosinophil counts and asthma-related outcomes^42^.
Our analysis has the unique strength that we assessed the genetic regulation of miRNA expression directly in well-phenotyped children with asthma from populations with high prevalence. Furthermore, GACRS and CAMP collected data based on analogous protocols, and the WGS, as well as miRNA data, were processed using the same procedures.
On the other hand, our study has the following limitations. First, the analyses are based on relatively small sample sizes, limiting both the power of the discovery in GACRS and the replication study in CAMP. Second, although previous work demonstrated the relevance of circulating miRNAs obtained from serum samples in the study of asthma etiology^43^, the functional significance of our findings needs to be assessed in more detail in future studies. This also applies to the hypothesized relevance of the reported novel cis-miRNA-QTL associations in children with asthma based on the absence of these associations in other studies. Since our analyses only include children with asthma, we cannot directly compare these associations in patients with asthma and unaffected individuals. Next, although multiple miRNAs are located on the X-chromosome, our QTL study did not perform a cis-miRNA-QTL analysis for these miRNAs, leaving it as a part of future steps. Moreover, our quality control excluded a small number of genetic variants with highly significant deviations from the Hardy-Weinberg Equilibrium, although these deviations could arise from the multi-ancestry characteristics of the cohorts, especially CAMP, rather than being due to technical artifacts. Additionally, our replication analysis in the three external studies is partly based on nominally significant external summary data only, and does not consider the direction of effect. Lastly, we note that the top-cis-miRNA-QTL analysis did not investigate the existence of independent secondary QTLs in the corresponding region, making the colocalization results sensitive regarding the selected size of the genetic region and potentially reflecting sharing of secondary QTL instead of the primary hit identified in our discovery analysis.
In summary, we characterized genetic determinants of serum miRNA levels in children with asthma using whole-genome sequencing data, discovering novel miRNA-QTL associations that are potentially specific or more common in children with asthma. The underlying genetic factors were linked to asthma-relevant gene expression and splicing regulation, as well as blood cell traits and asthma status GWAS results. This work highlights the connection between genetic factors, miRNAs, and asthma-relevant gene expressions and phenotypes. Future studies will aim to dissect these associations to unravel causal pathways.
Methods
Study populations
The miRNA-QTL discovery analysis utilized data from the “Genetic Epidemiology of Asthma in Costa Rica” Study (GACRS). Replication was performed using data from the Childhood Asthma Management Program (CAMP).
GACRS is a cross-sectional cohort of children aged 6 to 14 years of age with asthma (defined as physician-diagnosed asthma and at least two episodes of respiratory symptoms (wheezing, cough, or dyspnea), or a history of asthma attacks in the previous year) and at least six great-grandparents born in the Central Valley of Costa Rica. Recruitment and detailed study descriptions have been discussed in a previous publication^16^.
CAMP was a multi-center, randomized, double-blind clinical trial designed to determine the long-term effects of three inhaled treatments for childhood asthma^17^. CAMP included 1041 ethnically diverse children aged 5 to 12 years old with mild to moderate persistent asthma, as defined by the presence of symptoms or by the use of an inhaled bronchodilator at least twice weekly or the use of daily medication for asthma^44^.
GACRS was approved by the Mass General Brigham Human Research Committee at Brigham and Women’s Hospital (Boston, MA; protocol No. 2000P001130) and the Hospital Nacional de Niños (San José, Costa Rica).
For CAMP, all study procedures were approved by the Mass General Brigham Human Research Committee at Brigham and Women’s Hospital (Boston, MA; protocol No. 2011P000710).
The current work was approved by the Mass General Brigham Human Research Committee at Brigham and Women’s Hospital (Boston, MA; protocol No. 2017P001799). Written parental consent and participating child’s assent were obtained. The authors confirm that the study conforms to recognized standards of the Declaration of Helsinki for medical research involving human subjects. Both studies are registered at https://clinicaltrials.gov/ (NCT00021840: Genetic Epidemiology of Asthma in Costa Rica (first submission 2005-06-23) and NCT00000575: Childhood Asthma Management Program (CAMP) Phases I (Trial), II (CAMPCS), III (CAMPCS/2), and IV (CAMPCS/3) (first submission 2005-06-23)).
Whole-genome sequencing
Whole-genome sequencing (WGS) data for GACRS and CAMP were generated as part of the National Heart, Lung, and Blood Institute (NHLBI) TOPMed program^45,46^. Details on DNA sample handling, quality control, library construction, clustering and sequencing, read processing, and sequence data quality control are described on the TOPMed website. Our analyses are based on the Freeze 10 release. We extracted bi-allelic autosomal single nucleotide polymorphism (SNPs) and insertions/deletions with a minimum depth of coverage of 10 reads that passed variant quality control (GRCh38 genome reference). We identified a set of unrelated offspring with asthma for the miRNA-QTL analyses using kinship-based inference for GWAS (KING)^47^ and removed sex mismatches and samples with missing rates >=5%. Finally, we excluded genetic variants with a minor allele frequency (MAF) below 5%, Hardy-Weinberg deviation (p < 10^-8), or a missing rate >=2%.
MicroRNA data
Small RNA sequencing (RNA-seq) was performed on serum samples from GACRS and CAMP samples. The details and protocols for this sequencing were described previously^48^, and data was processed in GACRS and CAMP separately. Small RNA-seq libraries were generated based on the Norgen Biotek Small RNA Library Prep Kit (Norgen Biotek, Therold, Canada) and sequenced using the Illumina NextSeq 500 platform by Norgen Biotek. RNA-seq data quality control was performed using the ExceRpt pipeline^49^. MicroRNAs (miRNAs) with less than five mapped reads in 50% of the subjects were removed. Finally, normalization was performed using DESeq2^50^. We utilized miRBase data (version v22)^51–56^ to map miRNAs to (multiple) GRCh38 genomic positions (see Data Availability).
Cis- and trans-miRNA-QTL analysis
The miRNA-QTL analyses were performed using tensorQTL^18^ and included both cis-(locally) and trans-(at distance)analyses. Discovery analysis was performed using GACRS data, and the results were tested for replication in CAMP data.
Cis-miRNA-QTL analysis
The window size parameter to define cis-associations in the tensorQTL cis-analysis was set to the default value of 1 Mb. The cis-analysis performed by tensorQTL identifies the genetic variant with the strongest association with the miRNA expression within the specified genetic cis-region and computes a maximum test statistic and the corresponding p-value based on a beta distribution approximation inferred from a permutation distribution. We will refer to this selected genetic variant as the top-cis-miRNA-QTL. The internal empirical p-value computations used 10,000 permutations. We included the following covariates in the model: age, sex, batch (in GACRS), and the top five principal components of genetic ancestry (gPCs). Additional sensitivity analyses included the top five or top ten principal components (PCs) of the quality-controlled and normalized miRNA expression count matrix derived from the Reference-Free Adjustment for Cell-Type composition (ReFACTor) approach as covariates^20^ (adopted code, see Code Availability). These additional covariates address the potential presence of unmeasured technical noise or the residual influence of cell type-specific expression profiles across samples with varying cell type proportions. Moreover, we also considered an alternative factor adjustment using the top five and top ten factors from a non-negative matrix factorization of the count matrix. For the non-negative matrix factorization, we utilized the NMF R package (see Code Availability). The gPCs were computed from the Genetic Relationship Matrix (via PLINK2) based on genome-wide common variation data for all samples included in the analysis, separately in GACRS and CAMP. Here, the genome-wide genetic variation data were pruned for linkage-disequilibrium (LD) using PLINK2 (command ‘indep-pairwise’ with parameters ‘500 50 0.01’). Significant top-cis-miRNA-QTLs were identified using a False Discovery Rate q < =0.05 (qvalue R package) computed based on the top-cis-miRNA-QTL p-values across all incorporated miRNA measurements. The replication analysis using CAMP data extracted the variant-miRNA pairs corresponding to the significant top-cis-miRNA-QTLs identified in GACRS and corrected the corresponding nominal association p-values in CAMP for the number of tested pairs using a Bonferroni correction. In cases where the top-miRNA-QTL was unavailable in CAMP data, we considered the smallest association p-value among strong LD proxies, as identified in GACRS (r^2^ > =0.8 in GACRS). Additionally, sensitivity analyses for identified cis-miRNA-QTL associations investigated stratified age analyses (<= 9 years of age vs. > 9 years of age) and ancestry-specific effects by testing interactions between the QTL and gPC1. Finally, we performed a meta-analysis between GACRS and CAMP across all overlapping miRNAs for the cis-miRNA-QTL analysis based on the estimated slopes and corresponding standard errors.
Trans-miRNA-QTL analysis
The trans-miRNA-QTL analyses using tensorQTL incorporated the same set of covariates as the cis-analyses. The significance of trans-miRNA-QTL associations was evaluated based on a Bonferroni correction for the number of miRNA measurements tested in GACRS in combination with the genome-wide significance level of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5* {10}^{-8}$$\end{document} , that is \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p < \frac{5* {10}^{-8}}{{\#miRNA\; measurements}}$$\end{document} . Significant variant-miRNA pairs were tested for replication in CAMP using the same strategy described for the cis-analysis.
Replication using results from previously published miRNA-QTL analyses
Nikpay et al. conducted a genome-wide miRNA-QTL analysis of 2,083 miRNAs using plasma samples taken from 710 unrelated subjects of European genetic ancestry^14^. We downloaded the corresponding genome-wide summary statistics (see Data Availability) and tested the cis-miRNA-QTL associations detected in our GACRS analysis for replication. In cases where the top-miRNA-QTL was not available in the data by Nikpay et al., we considered strong LD proxies (criteria r^2^ > =0.8 in GACRS). In addition to evaluating the single variant association data, we performed a colocalization analysis between GACRS and the data by Nikpay et al. for the cis-miRNA-QTL hits. We defined a successful replication by a replication p-value that is nominally significant after adjusting for the number of associations tested or a colocalization posterior probability of more than 80%.
We also incorporated the cis-miRNA-QTL results by Huan et al., based on whole blood in 5239 individuals from the Framingham Heart Study^15^. Here, we checked for overlap with our cis-miRNA-QTL findings using the significant results provided by Huan et al. (cis-miRNA-QTL with FDR < 0.1, Supplementary Table 4 in^15^) (see Data Availability), incorporating LD proxies if needed, as described above. A successful replication was defined by a replication p-value that is nominally significant after adjusting for the number of associations tested.
Additionally, we included the recent results from genome-wide association studies of 2083 plasma circulating miRNAs in 2178 participants of the Rotterdam Study^22^. Mustafa et al. provided all nominally significant (p < =0.05) miRNA-QTL associations. Here, we used the same approach as for the Huan et al. data.
Colocalization analysis
We performed colocalization analyses to investigate if a) cis-miRNA-QTLs are shared between miRNAs in proximity (GACRS data), b) miRNA-QTL signals are consistent between GACRS and CAMP, c) miRNA-QTLs observed in GACRS are overlapping with previous miRNA-QTL analyses, d) cis-miRNA-QTLs overlap with expression quantitative trait loci (eQTLs) or splicing quantitative trait loci (sQTLs) in asthma-relevant tissues and cells, and e) cis-miRNA-QTLs colocalize with GWAS hits for asthma-related traits. The colocalization analyses were performed using HyPrColoc (hyprcoloc R package)^57^, applying default parameters and requiring at least ten overlapping genetic variants. The eQTL and sQTL summary statistics were obtained from the RNA sequencing results (Freeze 1.1) of the TOPMed initiative (see Data Availability). These cis-eQTL and cis-sQTL summary statistics cover nasal epithelial cells, whole blood, lung, monocytes, peripheral blood mononuclear cells (PBMC), and T-cells. GWAS summary statistics were extracted from the analysis of childhood-onset and adulthood-onset asthma^6^, a global meta-analysis of asthma^5^, the meta-analysis of lung function measurements by Shrine et al.^24^, and the UK Biobank-based analysis of 28 blood cell traits by Vuckovic et al.^23^. The latter includes, for example, the analysis of eosinophil counts, lymphocyte counts, and neutrophil counts. Previous studies have shown that blood eosinophil counts and asthma outcomes are related^42^. We declared a detected colocalization using a cutoff of 80% for the colocalization posterior probability. We note that the HyPrColoc analyses between miRNAs within GACRS data face a complete sample overlap.
Supplementary information
Supplementary Information Supplementary information
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 12022 GINA Main Report. Global Initiative for Asthma - GIN Ahttps://ginasthma.org/gina-reports/.
- 2Sharma R, et al. Circulating Micro RN As associated with Bronchodilator Response in Childhood Asthma. Res Sq [Preprint]. 10.21203/rs.3.rs-3101724/v 1 (2023).10.1186/s 12890-024-03372-4PMC 1153689839497092 · doi ↗ · pubmed ↗
- 3Hough, K. P. et al. Airway remodeling in asthma. Front. Med. 7 (2020).10.3389/fmed.2020.00191 PMC 725366932509793 · doi ↗ · pubmed ↗
- 4The Childhood Asthma Management Program (CAMP): Design, Rationale, and Methods. Control. Clin. Trials 20, 91–120 (1999).10027502 · pubmed ↗
- 5Denisenko, Y., Novgorodtseva, T. & Antonyuk, M. The activity of fatty acids desaturases in mild and moderate asthma. Eur. Respir. J. 56 (2020).
- 6Chiu, K.-L. et al. MEG 3 SNP variants are associated with the risk of asthma. Eur. Respir. J. 62 (2023).
- 7Long-term effects of Budesonide or Nedocromil in children with asthma. N. Engl. J. Med. 10 (2000).10.1056/NEJM 20001012343150111027739 · doi ↗ · pubmed ↗
