Genetic crossroads of cardiovascular disease and its comorbidities: toward holistic therapeutic strategies

Binisha H. Mishra; Pashupati P. Mishra

PMC · DOI:10.1186/s12920-025-02279-1·November 26, 2025

Genetic crossroads of cardiovascular disease and its comorbidities: toward holistic therapeutic strategies

Binisha H. Mishra, Pashupati P. Mishra

PDF

Open Access

TL;DR

This study explores shared genetic factors between cardiovascular disease and its comorbidities to support holistic treatment strategies.

Contribution

The study identifies novel genetic loci and genes linking CVD with comorbidities using a multivariate GWAS approach.

Findings

01

Four CVDs and seven comorbidities showed significant genetic correlations.

02

A multivariate GWAS identified 141 novel loci and 16 protein-coding genes linked to shared pathogenesis.

03

The findings suggest a shared genetic architecture between CVD and its comorbidities.

Abstract

With increasing life expectancy, the prevalence of cardiovascular disease (CVD) accompanied by comorbidities is rising, presenting a growing challenge for healthcare systems. Understanding shared genetic factors underlying CVD and its comorbid conditions may facilitate the development of more effective strategies for prevention and treatment. In this study, we investigated genetic correlations between CVD and common comorbidities using genome-wide association study (GWAS) summary statistics from the FinnGen R12 release, comprising data from 500,000 Finnish individuals. Following standard quality control procedures, we examined 19 disease endpoints using linkage disequilibrium score regression (LDSC) to estimate heritability and pairwise genetic correlations. Disease traits with significant heritability (z-score ≥ 4) and Bonferroni-corrected significant correlations (adjusted p < 0.05)…

Linked entities

Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.

Genes5

NPC1 TMEM106B PTPN22 MAP2K5 MSRA

Diseases11

cardiovascular disease transient ischemic attack atrial fibrillation myocardial infarction heart failure type 2 diabetes asthma obesity depression chronic obstructive pulmonary disease gingivitis

Figures2

Click any figure to enlarge with its caption.

Genetic correlations and multivariate genetic factor model of cardiometabolic diseases. A SNP-based pairwise genetic correlations for the nineteen cardiometabolic disorders estimated using Linkage Disequilibrium Score (LDSC) regression. Trait names highlighted in orange on the x- and y-axes represent the subset of 11 traits selected for further analysis, based on statistically significant pairwise genetic correlations (Bonferroni-adjusted *p*-value < 0.05) among each other. Three exceptions are noted: gingivitis was not significantly correlated with heart failure and myocardial infarction, chr

Manhattan plot of the multivariate genome-wide association study (GWAS) of the latent shared genetic factor (LGF) underlying cardiovascular diseases and their comorbidities. The *x*-axis shows genomic coordinates by chromosome, and the *y*-axis shows the − *log*_10_ of the unadjusted *p*-values for SNP associations with linkage disequilibrium score (LDSC) regression. Highlighted points indicate the 29 independent, novel SNPs identified through multivariate GWAS in this study. Associations were derived using GenomicSEM, leveraging linkage disequilibrium score (LDSC) regressionTable 2List of 29

Funding1

—Tampere University (including Tampere University Hospital)

Keywords

Cardiovascular diseasesComorbiditiesGenetic correlationMultivariate GWASGenomic SEMPleiotropyFinnGen

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGenetic Associations and Epidemiology · Bioinformatics and Genomic Networks · Genomics and Rare Diseases

Full text

Background

Cardiovascular diseases (CVDs) remain one of the most urgent global public health challenges, with significant implications for both human well-being and economic stability. In the European Union (EU), CVDs are the leading cause of mortality, accounting for approximately 32.4% of all deaths [1]. The economic burden is equally alarming, with an estimated annual cost of around €282 billion, driven by healthcare expenses, lost productivity, and informal care [2]. Despite global initiatives aimed at reducing their impact, progress in mitigating the burden of CVDs has been inadequate. At the same time, the prevalence of comorbid chronic conditions alongside CVDs continues to rise. This trend has led the European Commission to recognize multimorbidity, including comorbidities associated with CVD, as a key health priority, promoting awareness and fostering innovation in healthcare solutions to better address the growing complexities of managing multiple chronic conditions [3].

There is a growing consensus among public health experts and policymakers that healthcare systems must shift away from a narrow, single-disease focus toward a more holistic, multimorbidity-oriented approach. Effective management of complex cases involving CVD and its comorbidities requires a deeper understanding of how chronic diseases interact, how they jointly influence health outcomes, and how they affect treatment effectiveness and healthcare utilization. Managing CVD as a standalone condition often neglects the roles of co-occurring conditions such as diabetes, obesity, and depression, which can complicate care and reduce treatment efficacy.

It is crucial to move beyond clinical observations and investigate the underlying biological mechanisms that drive the co-occurrence of these conditions to develop more effective and targeted strategies. While previous studies have explored comorbid relationships between atherosclerosis and individual conditions such as osteoporosis [4], depression [5], lung disease [6], asthma [7], obesity and diabetes [8], comprehensive investigations that simultaneously consider multiple CVDs and their common comorbidities remain rare. One promising avenue for advancing this understanding is through the exploration of shared genetic architecture, which can reveal underlying biological connections between seemingly distinct CVD and its comorbidities.

A recent study by [9] investigated a similar multimorbidity concept using a genome-wide association study (GWAS) approach on cardiometabolic multimorbidity, defined as the co-occurrence of two or three cardiometabolic diseases. However, their study relied on traditional GWAS methods and did not analyze shared genetic architecture of the CVDs and their comorbidities using multivariate approaches. In contrast, the present study aims to assess the genetic correlations between CVD and its most prevalent comorbidities, identify shared genetic factors contributing to their co-occurrence and identify genetic variants associated with this common underlying genetic foundation utilizing advanced statistical methodologies. A deeper understanding of these shared genetic underpinnings of CVD and its comorbidities may uncover common biological pathways and guide the development of more integrated and effective shared strategies for their prevention, treatment, and management.

Material and methods

This study was based on GWAS summary statistics of 19 clinical endpoints related to CVDs and their comorbidities from FinnGen data freeze 12 [10]. The original GWAS analyses were adjusted for key covariates, including sex, age, the first 10 principal components of genetic ancestry, genotyping chip version (FinnGen chip version 1 or 2), and genotyping batch. Clinical endpoints were defined using the Finnish adaptation of the International Classification of Diseases, 10th revision (ICD-10), as implemented in national health registers. Historical records using ICD-8 and ICD-9 were harmonized to ICD-10 definitions to ensure consistency across the full-time span of the registry data [10]. Full definitions are provided in Supplementary Table 1. The FinnGen study is a large-scale genomics initiative that has analyzed over 500,000 Finnish biobank samples and correlated genetic variation with health data to understand disease mechanisms and predispositions. The project is a collaboration between research organizations and biobanks within Finland and international industry partners. The number of cases and controls for each disease, along with the effective sample size (Neff), calculated as 4 times the product of the number of cases and controls, divided by their sum, i.e., Neff = 4 × (cases × controls)/(cases + controls), to account for case–control imbalance, are presented in Table 1 [11]. Selection of GWASs on CVDs and their comorbidities was based on the comorbidities identified in [12], a comprehensive population-based study that systematically characterized major CVD comorbidities. Only those comorbidities that were available in FinnGen database were included. For classification purposes, we followed standard clinical and epidemiological definitions. Hypertension, although a major risk factor for CVD, is not classified as a CVD per se and was therefore categorized as a non-CVD comorbidity in our study. Similarly, pulmonary embolism, as a component of venous thromboembolism, primarily involves the venous rather than the arterial or cardiac systems and is typically considered distinct from classic CVDs. This classification was used to better capture shared genetic architecture between core CVD phenotypes and related but biologically or clinically distinct comorbidities. Since the same underlying cohort was used for all GWASs, the proportion of overlapping individuals between trait pairs varied from 7–76% depending on the phenotype definitions (Supplementary Table 2). Although the GWASs were derived from the same FinnGen cohort, differences in case definitions, exclusion criteria, and control group composition led to varying degrees of sample overlap between traits. This sample overlap was accounted for in the GenomicSEM modeling.Table 1. Cardiovascular diseases and their associated comorbidities included in this study, with number of cases, controls, effective sample size and genome-wide significant loci identified by univariate genome-wide association studies (GWASs)DiseasesCasesControlsEffective sample sizeSignificant lociCardiovascular comorbidities1Coronary atherosclerosis63307416171219793.51542Transient ischemic attack2494845327694586.0523Stroke34110450023126827104Atrial fibrillation63532252810203090.61775Myocardial infarction31666416171117707.7846Heart failure37653462695139277.9187Cardiomyopathy764937534329984.9547Non-cardiovascular comorbidities8Type 2 diabetes55418405261195005.72729Hypertension154630345634427337.442410Asthma47300259839160062.88711Metabolic dysfunction associated fatty liver disease350449684413917.84712Osteoporosis1046147326440939.081313Obesity31499468693118061.67614Depression59333434831208836.24515Chronic obstructive pulmonary disease2413840907091172.23516Epilepsy1564537460560071.19417Pulmonary embolism1276248631949742.653318Rheumatoid arthritis1631431511562043.894519Gingivitis137839310260381754.32

Quality control steps in the GWAS involved exclusion of participants with ambiguous gender, genotype missingness > 5%, excess heterozygosity (+ −4SD) and non-Finnish ancestry. SNPs with missingness > 2%, low HWE p-value < 1e-6 and minor allele count MAC < 3 were excluded [10].

Heritability for each disease and the genetic correlations between them were assessed using linkage disequilibrium score (LDSC) regression [11]. This study included only diseases with a heritability z-score ≥ 4, given that genetic correlation estimates below this threshold are typically characterized by high noise and limited reliability [13]. Genetic correlations with Bonferroni-adjusted p-values (adj.pval) < 0.05 were considered statistically significant.

A latent genetic factor (LGF) was constructed using genomic structural equation modeling (genomic SEM), based on the set of genetically correlated traits by leveraging the genetic covariance structure between them [14]. A multivariate GWAS of LGF was conducted using Genomic SEM to find shared genetic architecture of cardiovascular diseases and their comorbidities. Independent loci were identified using PLINK 2.0’s clumping procedure, with LD (linkage disequilibrium) threshold of r^2^ = 0.6 and a physical distance threshold of 1000 kilobases for clump formation [15]. The independent loci were annotated using SNPnexus, a web-based tool for genetic variant annotation using data from public repositories [16].

Results

Genetic correlation between CVDs and their comorbidities

Out of 19 clinical endpoints related to CVD and their comorbidities, we considered genetic correlations with an adjusted p-value < 0.05 as statistically significant (Fig. 1A, Supplementary Table 3). In addition, we reported selected endpoints that did not reach this threshold, due to their broader relevance across multiple conditions. For instance, although gingivitis was not significantly correlated with heart failure and myocardial infarction, it showed significant genetic correlations with nine other conditions. Similarly, chronic obstructive pulmonary disease was not correlated with transient ischemic attack but was significantly correlated with all other endpoints. Depression was not correlated with heart failure but was significantly correlated with the remaining studied conditions. These findings indicate that, despite the lack of statistical significance for a few individual endpoints, these diseases remain important due to their consistent genetic correlations across most of the clinical outcomes. As a result, we selected 11 of the 19 diseases for further analysis of potential shared genetic etiology. These comprised four CVDs (heart failure, myocardial infarction, arterial fibrillation and transient ischemic attack) and seven comorbidities (type 2 diabetes, gingivitis, chronic obstructive pulmonary disease, depression, obesity, asthma and hypertension). Overall, the findings support a shared genetic foundation and etiology underlying these clinically distinct conditions.Fig. 1. Genetic correlations and multivariate genetic factor model of cardiometabolic diseases. A SNP-based pairwise genetic correlations for the nineteen cardiometabolic disorders estimated using Linkage Disequilibrium Score (LDSC) regression. Trait names highlighted in orange on the x- and y-axes represent the subset of 11 traits selected for further analysis, based on statistically significant pairwise genetic correlations (Bonferroni-adjusted p-value < 0.05) among each other. Three exceptions are noted: gingivitis was not significantly correlated with heart failure and myocardial infarction, chronic obstructive pulmonary disease (COPD) was not significantly correlated with transient ischemic attack (TIA), and depression was not associated with heart failure. B Path diagram with standardized factor loadings in the hierarchical model estimated using Genomic SEM. The μ represents the residual variance that is not explained by the latent factor. Abbreviations: Pulmo-embolism, pulmonary embolism; MI, myocardial infarction; AF, atrial fibrillation and flutter; Coronary-athero, coronary atherosclerosis; MASLD, metabolic dysfunction-associated steatotic liver disease; T2D, type 2 diabetes mellitus; LGF, latent genetic factor, a surrogate variable for shared genetic foundation for these clinically distinct diseases

Shared genetic architecture of CVDs and their comorbidities

Using a multivariate GWAS of LGF underlying 11 genetically correlated CVDs and the comorbid conditions (Fig. 1B), we identified a total of 147 SNPs reaching genome-wide significance (p-value < 5 × 10^–8^). The LGF model showed a moderate fit, with a comparative fit index (CFI) of 0.83 and a standardized root mean square residual (SRMR) of 0.12. These fit indices indicate that the common factor captures a substantial portion of the shared genetic variance among the 11 traits, although some residual heterogeneity remains, suggesting that more complex factor structures may better represent the genetic architecture.

Of the 147 SNPs, 141 were not previously reported as genome-wide significant in any of the univariate GWASs for the 11 traits (Table 1). Of the 141 SNPs, 29 were identified as independent loci through PLINK-based clumping (Fig. 2) (Table 2). The 6 previously reported SNPs (rs6741951, rs9856151, rs2397049, rs7831557, rs876954, rs1788817), which reached significance in the multivariate model and showed associations across multiple traits mapped to genes MSRA, NPC1, CPNE4, and RMC1.Fig. 2. Manhattan plot of the multivariate genome-wide association study (GWAS) of the latent shared genetic factor (LGF) underlying cardiovascular diseases and their comorbidities. The x-axis shows genomic coordinates by chromosome, and the y-axis shows the − log10 of the unadjusted p-values for SNP associations with linkage disequilibrium score (LDSC) regression. Highlighted points indicate the 29 independent, novel SNPs identified through multivariate GWAS in this study. Associations were derived using GenomicSEM, leveraging linkage disequilibrium score (LDSC) regressionTable 2List of 29 independent novel genetic variants with genomic location, p-values, and annotation, identified via a multivariate genome-wide association study (GWAS) of a latent genetic factor (LGF) representing genetically correlated cardiovascular diseases (CVDs) and their comorbidities. Also shown are the lowest p-values for each SNP across all individual CVD traits and across all comorbid traits from univariate GWASrsIDChromosome:positionP-valueLowest p-value across all CVD traitsLowest p-value across all comorbid traitsGeneType**Gene ontology – biological processrs1076773411:286208344.2e-146.8e-052.7e-11NoneNoneNoners69844968:109385832.5e-132.9e-035.8e-17XKR6Protein codingApoptotic processrs163168518:235542753.5e-121.4e-034.2e-12NPC1Protein codingCholesterol binding/signaling receptor activityrs995527618:18393384.2e-111.3e-041.9e-12NoneNoneNoners178882618:235740604.4e-119.7e-041.0e-09NPC1Protein codingCholesterol binding/signaling receptor activityrs93733636:1428289064.4e-111.4e-052.0e-12HIVEP2Protein codingRegulation of transcriptionrs69669157:122263621.4e-101.1e-038.0e-10TMEM106BProtein codingDendrite morphogenesis/Lysosomal organizationrs9872376:508353372.1e-103.1e-043.1e-19TFAP2BProtein codingGlucose metabolic process/Regulation of transcriptionrs10455298:90325883.7e-104.9e-031.2e-12ERI1Protein codingRNA metabolismrs99729515:677240054.1e-105.3e-033.1e-13MAP2K5Protein codingMAPK signaling cascaders24766011:1138349464.9e-101.1e-057.0e-12PTPN22Protein codingImmune systemrs78375878:85214825.4e-102.4e-021.5e-14NoneNoneNoners48410058:86437206.7e-101.0e-023.6e-13NoneNoneNoners76427493:356273977.5e-102.7e-031.9e-08NoneNoneNoners800066613:1077122568.1e-106.9e-023.6e-02FAM155AProtein codingCalcium channel activityrs170497122:587340018.8e-101.4e-012.9e-12LINC01122lincRNANoners78379798:103410249.8e-101.4e-034.0e-12MSRAProtein codingMethionine metabolic process/Response to oxidative stressrs43824808:88639631.5e-093.1e-043.5e-12MFHAS1Protein codingInflammatory responsers47119666:509960641.5e-097.8e-066.3e-10NoneNoneNoners1691077012:153527971.8e-091.4e-039.4e-11PTPROProtein codingRegulation of glomerular filtrationrs70053638:104262381.9e-093.0e-033.3e-16MSRAProtein codingMethionine metabolic process/Response to oxidative stressrs100865218:109262592.1e-091.8e-033.7e-09XKR6Protein codingApoptotic processrs48413188:103296902.1e-091.4e-023.4e-12MSRAProtein codingMethionine metabolic process/Response to oxidative stressrs130639793:1318458972.2e-093.3e-048.9e-09CPNE4Protein codingCellular response to calcium ionrs74289183:187828062.3e-092.1e-041.3e-09SATB1-AS1Processed transcriptNoners606822220:525418023.1e-095.4e-033.8e-09LINC01524lincRNANoners1258902814:690760253.2e-091.8e-031.1e-08DCAF5Protein codingProtein ubiquitination/Fatty acid biosynthetic processrs87100411:284909114.0e-093.1e-032.2e-14METTL15Protein codingMethylationrs69938418:107751744.2e-091.2e-032.3e-12PINX1Protein codingTelomere maintenance

To aid interpretation of multivariate results, for each SNP, we report the lowest p-value among CVD traits and the lowest p-value among comorbid traits in Table 2. We report the marginal associations (β, SE, and p-values) of all 29 genome-wide significant SNPs across the 11 input traits in Supplementary Table 4. This table also includes the heterogeneity Q-values, which indicate whether the SNP effects are consistent across traits or driven predominantly by one or a few. Additionally, Supplementary Table 5 presents the standardized factor loadings, standard errors, and p-values for each trait from the LGF model. This allows differentiation between SNPs likely to be truly pleiotropic and those predominantly driven by one trait domain. Notably, 10 SNPs exhibited Q p-values < 0.001, indicating that their effects were primarily driven by a subset of traits, whereas the remaining loci showed more consistent effects across multiple traits. These patterns highlight the importance of considering multivariate significance alongside marginal associations and Q_SNP statistics.

These 29 SNPs mapped to 16 protein-coding genes involved in several key biological processes associated with CVDs and their comorbidities, including NPC1, TMEM106B, PTPN22, MAP2K5, and MSRA**,** which are further discussed for their pleiotropic effects and potential therapeutic relevance. Importantly, several of the 29 independent SNPs had only modest or nominal associations in individual univariate GWAS, illustrating how the multivariate LGF approach captures variants contributing to shared genetic liability that may be missed when examining traits individually. Functional annotation of these loci revealed that these included fundamental cellular functions such as metal ion binding, methyltransferase activity, and protein metabolism; key metabolic processes like the regulation of fatty acid biosynthesis and LDL clearance; as well as pathways involved in calcium ion import across the plasma membrane, signal transduction, immune system function, dendrite morphogenesis, lysosomal trafficking and apoptosis (Table 2). Collectively, these results provide evidence for a shared genetic architecture underpinning clinically distinct yet genetically correlated cardiovascular and comorbid conditions.

Discussion

In this study, we utilized multivariate genomic approaches to identify novel genetic loci associated with a LGF underlying 11 genetically correlated CVDs and related comorbidities, using GWAS summary statistics from the wide FinnGen database. These 11 conditions were selected from an initial set of 19 based on significant genetic correlations, reflecting a shared genetic architecture. These findings suggest that the co-occurrence of CVDs and their comorbidities may stem from overlapping pleiotropic genetic mechanisms, rather than solely from shared environmental or behavioral risk factors. Through multivariate GWAS of the LGF, our analysis uncovered genetic loci not previously implicated in individual GWASs of CVDs, underscoring the utility of modeling shared genetic foundation and liability. These findings highlight pleiotropic mechanisms that may contribute to both cardiovascular pathology and its related comorbidities, thereby offering promising avenues for future genetic, biological, and therapeutic research.

To further support the interpretation of these multivariate GWAS results, we examined the marginal associations of all 29 genome-wide significant SNPs across the 11 included traits (Supplementary Table 4). This allowed us to assess whether these SNPs reflect truly pleiotropic effects or are predominantly driven by associations with specific subsets of traits (for example, only CVDs or only comorbidities). We also report heterogeneity Q-values derived from the GenomicSEM framework, which help quantify the extent to which SNP effects diverge across traits. Additionally, the standardized factor loadings, standard errors, and p-values for the 11 traits on the latent genetic factor are presented in Supplementary Table 5. These loadings ranged from 0.36 to 0.79, with 7 of the 11 traits loading ≥ 0.5, suggesting that the latent factor captures a robust and broadly shared genetic architecture spanning both cardiovascular and comorbid conditions.

While all 29 SNPs identified in the multivariate GWAS reached genome-wide significance for the LGF and are therefore relevant to the shared genetic architecture of CVD and comorbid traits, inspection of marginal associations and Q_SNP heterogeneity statistics provides critical context for interpretation. Multivariate significance does not require strong univariate effects in every trait; rather, GenomicSEM leverages cross-trait genetic covariance to detect variants that might be missed in univariate analyses. By reporting the lowest p-values across CVDs and comorbid traits (Table 2) alongside heterogeneity measures, we distinguish loci that are likely pleiotropic across both domains from those predominantly driven by one set of traits.

Functional annotation revealed that many of the identified genes are involved in biological processes central to cardiovascular and metabolic health, including lipid metabolism, vascular integrity, mitochondrial function, inflammation, and oxidative stress. Among the most compelling findings is the association of a SNP within NPC1 (Niemann-Pick disease, type C1), a gene essential for intracellular cholesterol transport and lipid homeostasis [17]. Dysregulation of NPC1 has been implicated in atherosclerosis [18], and its association with the LGF in our study reinforces its role in lipid trafficking as a shared pathogenic mechanism across genetically correlated CVDs. NPC1 showed strong multivariate significance, suggestive associations with CVD traits (p-value ≈ 1e-03), and strong associations with comorbid trait (p-value ≤ 1.0e-09), indicating that its LGF contribution is largely comorbid trait-driven, consistent with its known role in cardiometabolic disease.

MSRA (methionine sulfoxide reductase A) was also genome-wide significant for the LGF. Known for its antioxidant role in repairing oxidized methionine residues [19], MSRA has been linked to improved lipid metabolism and atherosclerotic protection in animal models [20] and to neuropsychiatric phenotypes in human cohorts [21, 22]. In our dataset, MSRA showed modest CVD associations (lowest p-value = 1.4e-03) but stronger evidence in comorbid traits (p-value ≤ 4.0e-12), suggesting it is more comorbid trait-driven. Nonetheless, its role in oxidative stress defense and prior pleiotropic links underscore its relevance to both cardiovascular and systemic health.

For TMEM106B (Transmembrane Protein 106B), a locus previously linked to both depression and coronary artery disease [23], we observed genome-wide significance in the LGF, strong association with comorbid traits (p-value = 8.0e-10) but only marginal evidence in CVD traits (p-value = 1.1e-03). This indicates that in our dataset, the locus is primarily comorbid trait-driven rather than broadly pleiotropic, although prior evidence suggests potential cross-system mechanisms.

PTPN22 (Protein Tyrosine Phosphatase Non-Receptor Type 22), a well-known immune regulator, has been implicated in autoimmune disease and inflammation. Recent studies suggest roles in calcific aortic valve disease pathogenesis [24], highlighting therapeutic potential through modulation of immune activity. In our results, PTPN22 showed strong associations with both CVD (p-value = 1.1e-05) as well as with comorbid traits (p-value = 7.0e-12).

MAP2K5, a component of the MAPK/ERK5 signaling pathway that influences endothelial dysfunction and atherosclerosis [25, 26], reached genome-wide significance in the LGF and strong comorbid signal (p-value = 3.1e-13) but showed only nominal evidence in CVD traits (p-value = 5.3e-03). Thus, MAP2K5’s contribution appears modest and primarily comorbid trait-specific, though its pathway-level functions remain biologically relevant.

Other notable genes included XKR6 (XK-related protein 6), implicated in apoptotic processes central to atherosclerosis [27], and ERI1 (exoribonuclease 1), which may influence vascular remodeling through RNA metabolism [28]. PTPRO (protein tyrosine phosphatase receptor type O) points to the renal–cardiovascular axis via its regulation of glomerular filtration [29]. METTL15 (Methyltransferase Like 15), involved in mitochondrial RNA methylation [30], underscores the importance of mitochondrial integrity in cardiovascular disease. Genes such as HIVEP2 (human immunodeficiency virus type I enhancer binding protein) and CPNE4 (copine 4), though lacking direct CVD associations, contribute to neurodevelopmental and metabolic processes [31, 32], which may intersect with cardiometabolic risk.

To assess the external validity of our results, we cross-referenced them with prior GWAS findings. This analysis showed that 14 SNPs had been previously associated with relevant traits in independent cohorts, including type 1 and type 2 diabetes, body mass index, waist-hip ratio, insomnia, neuroticism, rheumatoid arthritis, and metabolite levels. These associations are presented in Supplementary Table 6. Although we did not aim to replicate individual SNP associations, the presence of such overlaps across diverse phenotypes supports the biological plausibility and pleiotropic nature of the identified loci.

There were certain limitations in our study. First, this study is based on GWAS summary-level data. However, the use of summary statistics from FinnGen, a large, well-characterized population cohort, allows for harmonized, high-powered analyses across multiple traits. GenomicSEM is specifically designed for such data and has been widely validated. While individual-level data would enable more flexible modeling, access to such data at comparable scale is often not feasible.

Also, effective sample size for each clinical endpoint varied, ranging from 13917.84 for MAFLD to 427337.4 for hypertension and the GWASs of the selected clinical endpoints available in FinnGen database had sample overlap ranging from 7–76%. The varying sample size and overlaps may have impacted our study findings. The moderate fit indices (CFI = 0.83, SRMR = 0.12) for the LGF model indicate that while the common factor effectively captures shared genetic influences across CVD and comorbid traits, the genetic architecture is likely more complex. Future work could explore multi-factor models or alternative structures to better characterize distinct pleiotropic patterns and trait-specific effects. Also, sex-stratified analysis was not possible due to the use of GWAS summary data as input. Therefore, further research should investigate how shared genetic risks differ by sex. Lastly, all samples were primarily of European ancestry; therefore, our findings are relevant only for European-ancestry populations.

Conclusions

In conclusion, our multivariate GWAS identified 29 loci contributing to a LGF shared across CVD and comorbid traits. While all 29 loci are valid discoveries under the GenomicSEM framework, marginal associations and Q_SNP heterogeneity reveal heterogeneity in how individual loci act. Based on marginal p-values (Table 2), the majority of loci are primarily comorbid trait**-**driven in our data. The implicated pathways converge on lipid metabolism, mitochondrial function, inflammation, apoptosis, RNA regulation and neurovascular signaling. These genes therefore remain promising candidates for follow-up, but their prioritization for therapeutic development should reflect whether they are pleiotropic or domain specific as indicated by the marginal and heterogeneity data.

Supplementary Information

Supplementary Material 1: Table S1: List of 19 diseases including cardiovascular diseasesand their comorbidities and their definitions. Table S2: Percentage of sample overlap between FinnGen genome-wide association studiesof the 19 diseases. Table S3: Statistical significanceof the genetic correlation among the 19 cardiovascular diseases and their comorbidities as estimated using Linkage disequilibrium scoreregression. Table S4: Summary of 29 SNPs identified from multitrait GWAS via GenomicSEM, including original GWAS betas, standard errors, p-values, alternate allele frequency, alternate allele frequency among cases, alternate allele frequency among controlsand Q-SNP heterogeneity statistics. Table S5: Standardized factor loadings and standard errors for individual traits in the GenomicSEM model of the genetic latent factor. Table S6: Previously reported GWAS associations for 29 independent SNPs supporting external validity of multivariate GWAS findings.

Bibliography2

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Eurostat. 2023. Available from: https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Cardiovascular_diseases_statistics.
2European Commission. Which priorities for a European Policy on multimorbidity? 2015. Available from: https://ephconference.eu/repository/sections/cd/ev_20151007_frep_en.pdf.