# Leveraging molecular-QTL co-association to predict novel disease-associated genetic loci using a graph convolutional neural network

**Authors:** Julian Ng-Kee-Kwong, Andrew D. Bretherick

PMC · DOI: 10.1371/journal.pone.0324183 · PLOS One · 2025-06-10

## TL;DR

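This paper encodes the genetic co-association of molecular traits (DNA CpG-site methylation and RNA expression) with PinSage, a graph convolutional neural network-based recommender system, and uses the resulting model to predict disease-associated loci. The aim is to ease the genome-wide multiple-testing burden that limits conventional genome-wide association studies.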

## Contribution

The paper introduces the use of a graph convolutional neural network (PinSage) to predict disease-associated loci from molecular-QTL co-associations.
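To make "co-association" concrete, the sketch below shows one way molecular-QTL results could be arranged as a bipartite graph of variants and molecular traits, the kind of structure a graph-based recommender operates on. It is an illustrative assumption only, not the authors' pipeline: the identifiers (`snp_A`, `cpg_1`, `rna_1`), the toy table, and both helper functions are hypothetical, and no neural network is trained here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy QTL table: each pair links a variant to a molecular trait
# (a CpG site or an RNA). All identifiers are placeholders, not real data.
qtl_pairs = [
    ("snp_A", "cpg_1"),
    ("snp_A", "cpg_2"),
    ("snp_B", "cpg_2"),
    ("snp_B", "rna_1"),
    ("snp_C", "rna_1"),
]


def build_bipartite(pairs):
    """Return adjacency maps for both sides of the variant-trait graph."""
    snp_to_traits = defaultdict(set)
    trait_to_snps = defaultdict(set)
    for snp, trait in pairs:
        snp_to_traits[snp].add(trait)
        trait_to_snps[trait].add(snp)
    return snp_to_traits, trait_to_snps


def co_associated_snps(trait_to_snps):
    """Count, for each pair of variants, how many molecular traits they share."""
    shared = defaultdict(int)
    for snps in trait_to_snps.values():
        for a, b in combinations(sorted(snps), 2):
            shared[(a, b)] += 1
    return dict(shared)


snp_to_traits, trait_to_snps = build_bipartite(qtl_pairs)
co_assoc = co_associated_snps(trait_to_snps)
print(co_assoc)
```

In this toy example, `snp_A` and `snp_C` share no trait directly but are both linked to `snp_B`. Indirect neighbourhood structure of this kind is what a graph convolutional model can aggregate over when scoring candidate associations.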

## Key findings

- A model trained only on methylation QTL data recapitulated over half (554,209 of 1,021,052) of the possible SNP-RNA associations identified in a large eQTL meta-analysis.
- Height-associated loci predicted by a model trained on molecular-QTL data replicated, after Bonferroni correction, at a rate comparable to loci that were genome-wide significant in UK Biobank (88% versus 91%).
- Across 64 disease outcomes in UK Biobank, the same model identified 143 independent novel disease associations, with 39 of 103 (38%) replicated in an independent sample.
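The Bonferroni correction mentioned in the findings above divides the significance level by the number of tests performed, so testing fewer candidate loci gives a less stringent per-test threshold. The sketch below illustrates that arithmetic only. The figure of one million tests corresponds to the conventional genome-wide threshold of about 5e-8, while the candidate-set size of 10,000 is a hypothetical number chosen for illustration and is not taken from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that controls family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Conventional genome-wide setting: roughly one million independent tests.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical setting: only 10,000 model-prioritised candidates are tested.
candidate = bonferroni_threshold(0.05, 10_000)

# Factor by which the per-test threshold is relaxed in the smaller test set.
relaxation = candidate / genome_wide

print(f"genome-wide threshold: {genome_wide:.1e}")
print(f"candidate-set threshold: {candidate:.1e}")
```

Under these assumed numbers, an association needs a p-value about 100 times less extreme to pass the threshold. This is the general mechanism by which prioritising loci can reduce the multiple-testing burden.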

## Abstract

Genome-wide association studies (GWAS) have successfully uncovered numerous associations between genetic variants and disease traits to date. Yet, identifying significantly associated loci remains a considerable challenge due to the concomitant multiple-testing burden of performing such analyses genome-wide. Here, we leverage the genetic associations of molecular traits – DNA CpG-site methylation status and RNA expression – to mitigate this problem. We encode their co-association across the genome using PinSage, a graph convolutional neural network-based recommender system previously deployed at Pinterest. We demonstrate, using this framework, that a model trained only on methylation quantitative trait locus (QTL) data could recapitulate over half (554,209/1,021,052) of possible SNP-RNA associations identified in a large expression QTL meta-analysis. Taking advantage of a recent ‘saturated’ map of height associations, we then show that height-associated loci predicted by a model trained on molecular-QTL data replicated comparably, following Bonferroni correction, to those that were genome-wide significant in UK Biobank (88% compared to 91%). On a set of 64 disease outcomes in UK Biobank, the same model identified 143 independent novel disease associations, with at least one additional association for 64% (41/64) of the disease outcomes examined. Excluding associations involving the MHC region, we achieve a total uplift of over 8% (128/1,548). We successfully replicated 38% (39/103) of the novel disease associations in an independent sample, with suggestive evidence for six additional associations from GWAS Catalog. Replicated associations included for instance that between rs10774625 (nearest gene: SH2B3/ATXN2) and coeliac disease, and that between rs12350420 (nearest gene: MVB12B) and glaucoma. For many GWAS, attaining such an enhancement by simply increasing sample size may be prohibitively expensive, or impossible depending on disease prevalence.
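The percentages quoted in the abstract can be checked directly from the counts it reports. The short sketch below recomputes them. The counts come from the abstract, while the dictionary keys are labels invented here for readability.

```python
# Counts as reported in the abstract: (numerator, denominator).
# The key names are hypothetical labels, not terms from the paper.
reported = {
    "eqtl_associations_recapitulated": (554_209, 1_021_052),
    "outcomes_with_new_association": (41, 64),
    "uplift_excluding_mhc": (128, 1_548),
    "novel_associations_replicated": (39, 103),
}

# Convert each count pair to a percentage.
percentages = {
    name: 100 * numerator / denominator
    for name, (numerator, denominator) in reported.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

The recomputed values agree with the abstract's rounded figures: just over half for the eQTL recapitulation, 64% of outcomes, an uplift of a little over 8%, and 38% replication.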

## Linked entities

- **Genes:** SH2B3 (SH2B adaptor protein 3) [NCBI Gene 10019], ATXN2 (ataxin 2) [NCBI Gene 6311], MVB12B (multivesicular body subunit 12B) [NCBI Gene 89853]
- **Diseases:** glaucoma (MONDO:0005041)

## Full-text entities

- **Genes:** ATXN2 (ataxin 2) [NCBI Gene 6311] {aka ATX2, SCA2, TNRC13}, SH2B3 (SH2B adaptor protein 3) [NCBI Gene 10019] {aka IDDM20, LNK}, HLA-C (major histocompatibility complex, class I, C) [NCBI Gene 3107] {aka D6S204, HLA-JY3, HLAC, HLC-C, MHC, PSORS1}, MVB12B (multivesicular body subunit 12B) [NCBI Gene 89853] {aka C9orf28, FAM125B}
- **Diseases:** coeliac disease (MESH:D004194), glaucoma (MESH:D005901)
- **Mutations:** rs10774625, rs12350420

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12151363/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12151363/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC12151363/full.md

---
Source: https://tomesphere.com/paper/PMC12151363