Genotype-Phenotype Integration through Machine Learning and Personalized Gene Regulatory Networks for Cancer Metastasis Prediction
Jiwei Fu, Chunyu Yang

TL;DR
This study develops an integrated machine learning framework combining classical models and graph neural networks to improve personalized prediction of cancer metastasis by leveraging gene expression and regulatory network data.
Contribution
The paper combines classical machine learning models with a graph neural network applied to patient-specific gene regulatory networks for metastasis prediction.
Findings
XGBoost achieved the highest AUROC (0.7051) among the benchmarked models.
The graph neural network (GATv2) captured non-linear regulatory dependencies.
The framework enables scalable, interpretable metastasis risk prediction.
Abstract
Metastasis is the leading cause of cancer-related mortality, yet most predictive models rely on shallow architectures and neglect patient-specific regulatory mechanisms. Here, we integrate classical machine learning and deep learning to predict metastatic potential across multiple cancer types. Gene expression profiles from the Cancer Cell Line Encyclopedia were combined with a transcription factor-target prior from DoRothEA, focusing on nine metastasis-associated regulators. After selecting differential genes using the Kruskal-Wallis test, ElasticNet, Random Forest, and XGBoost models were trained for benchmarking. Personalized gene regulatory networks were then constructed using PANDA and LIONESS and analyzed through a graph attention neural network (GATv2) to learn topological and expression-based representations. While XGBoost achieved the highest AUROC (0.7051), the GNN captured…
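The feature-selection and benchmarking steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression matrix is synthetic (not CCLE data), the helper `kruskal_select` and the `alpha` threshold are hypothetical, and scikit-learn's elastic-net-penalized logistic regression stands in for the ElasticNet model. XGBoost is omitted to keep the dependencies small; `xgboost.XGBClassifier` could be added to the `models` dictionary in the same way.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for an expression matrix (samples x genes); NOT CCLE data.
n_samples, n_genes, n_informative = 200, 500, 20
y = rng.integers(0, 2, size=n_samples)          # 1 = metastatic, 0 = non-metastatic
X = rng.normal(size=(n_samples, n_genes))
X[:, :n_informative] += 0.8 * y[:, None]        # shift a few genes in one class

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def kruskal_select(X, y, alpha=0.01):
    """Return indices of genes whose Kruskal-Wallis p-value is below alpha.

    Hypothetical helper; alpha is an assumed threshold, not taken from the paper.
    """
    groups = [X[y == label] for label in np.unique(y)]
    pvals = np.array(
        [kruskal(*(g[:, j] for g in groups)).pvalue for j in range(X.shape[1])]
    )
    return np.flatnonzero(pvals < alpha)


# Select genes on the training split only, to avoid leaking test labels.
selected = kruskal_select(X_train, y_train)

models = {
    # Elastic-net-penalized logistic regression as a classification
    # analogue of ElasticNet (an assumption of this sketch).
    "elastic_net_logreg": make_pipeline(
        StandardScaler(),
        LogisticRegression(
            penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000
        ),
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit each model on the selected genes and score it by AUROC on held-out data.
results = {}
for name, model in models.items():
    model.fit(X_train[:, selected], y_train)
    scores = model.predict_proba(X_test[:, selected])[:, 1]
    results[name] = roc_auc_score(y_test, scores)

for name, auc in results.items():
    print(f"{name}: AUROC = {auc:.3f}")
```

Because the synthetic signal is deliberately strong, the AUROC values printed here say nothing about the reported 0.7051; the sketch only shows how the selection and evaluation steps fit together.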
