Graph Based Link Prediction between Human Phenotypes and Genes
Rushabh Patel, Yanhui Guo

TL;DR
This paper presents a machine learning framework that predicts links between human phenotypes and genes using graph embeddings and supervised learning, achieving high accuracy in identifying genotype-phenotype associations.
Contribution
It introduces a novel approach combining node2vec embeddings with multiple classifiers, notably LightGBM, for accurate phenotype-gene link prediction.
Findings
LightGBM achieved AUROC 0.904 and AUCPR 0.784.
The framework outperforms other tested algorithms in link prediction accuracy.
High F1 score of 0.87 indicates reliable predictions.
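For readers unfamiliar with the two ranking metrics reported above, the following is a minimal sketch of how AUROC and a step-wise estimate of AUCPR (average precision) can be computed from predicted link scores. The labels and scores are hypothetical toy values, not data from the paper, and the paper's exact evaluation procedure is not given in this summary.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for ps in pos:
        for ns in neg:
            if ps > ns:
                wins += 1.0
            elif ps == ns:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean precision at the rank of each positive (step-wise PR area).

    Tied scores are ordered arbitrarily here; this is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy example: 1 = true phenotype-gene link, 0 = non-link.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

AUROC depends only on how positives rank relative to negatives, while average precision is more sensitive to class imbalance. That sensitivity is one common reason link-prediction studies report both.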
Abstract
Background: Deep phenotyping is the detailed and precise analysis of phenotypic abnormalities to learn genotype-phenotype associations and the history of human disease. Understanding and detecting the interaction between phenotype and genotype is a fundamental step in translating precision medicine to clinical practice. Recent advances in machine learning make it possible to predict these interactions between abnormal human phenotypes and genes efficiently. Methods: In this study, we developed a framework to predict links between Human Phenotype Ontology (HPO) terms and genes. Annotation data from heterogeneous knowledge resources, i.e., Orphanet, is used to parse human phenotype-gene associations. To generate embeddings for the nodes (HPO terms and genes), the node2vec algorithm was used. It performs node sampling on this graph based on random walks, then…
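The random-walk sampling step mentioned in the abstract can be illustrated with a minimal sketch of a node2vec-style biased walk on a small phenotype-gene graph. The node names, the edge list, and the values of the return parameter `p` and in-out parameter `q` are hypothetical placeholders chosen for illustration; the paper's actual graph and hyperparameters are not given in this summary.

```python
import random

# Hypothetical toy edges between placeholder phenotype and gene nodes.
EDGES = [
    ("HP:toy1", "GENE_A"),
    ("HP:toy1", "GENE_B"),
    ("HP:toy2", "GENE_A"),
    ("HP:toy2", "GENE_C"),
    ("HP:toy3", "GENE_B"),
]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: sorted list of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Sample one second-order random walk in the style of node2vec."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay within one hop of the previous node
            else:
                weights.append(1.0 / q)  # move further away from the previous node
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


adj = build_adjacency(EDGES)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
walks = [biased_walk(adj, node, 6, p=1.0, q=0.5, rng=rng) for node in adj]
```

In node2vec generally, walks like these are treated as sentences and passed to a skip-gram model to learn one vector per node. Because this toy graph is bipartite, every walk alternates between phenotype and gene nodes.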
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks · Genetics, Bioinformatics, and Biomedical Research
Methods: node2vec
