Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data
Mugariya Farooq, Shahad Hardan, Aigerim Zhumbhayeva, Yujia Zheng, Preslav Nakov, Kun Zhang

TL;DR
This paper explores the use of causal discovery algorithms combined with large language models to identify and validate factors affecting breast cancer survival from multi-omics data, aiming for more explainable clinical insights.
Contribution
It introduces a novel approach integrating causal discovery methods with language models to evaluate causal relations in breast cancer prognosis from genomic data.
Findings
Identified key causal factors related to patient survival.
Validated causal findings using biomedical language models.
Highlighted the potential and challenges of causal discovery in healthcare.
Abstract
The need for more usable and explainable machine learning models in healthcare increases the importance of developing and utilizing causal discovery algorithms, which aim to discover causal relations by analyzing observational data. Explainable approaches aid clinicians and biologists in predicting the prognosis of diseases and suggesting proper treatments. However, very little research has been conducted at the crossroads between causal discovery, genomics, and breast cancer, and we aim to bridge this gap. Moreover, evaluation of causal discovery methods on real data is in general notoriously difficult because ground-truth causal relations are usually unknown, and accordingly, in this paper, we also propose to address the evaluation problem with large language models. In particular, we exploit suitable causal discovery algorithms to investigate how various perturbations in the genome…
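The idea of discovering causal relations from observational data can be illustrated with a conditional-independence check, the basic building block of constraint-based causal discovery. The sketch below is not the authors' method or data: the variable names, the linear-Gaussian chain, and the coefficients are assumptions chosen purely for illustration. It simulates a chain x -> y -> z and shows that x and z are correlated marginally but nearly uncorrelated once y is accounted for, which is the kind of pattern a discovery algorithm uses to rule out a direct edge between x and z.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists of floats."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(a, given):
    """Residuals of a simple least-squares regression of `a` on `given`."""
    n = len(a)
    mean_a, mean_g = sum(a) / n, sum(given) / n
    slope = sum((x - mean_a) * (g - mean_g) for x, g in zip(a, given)) / sum(
        (g - mean_g) ** 2 for g in given
    )
    return [(x - mean_a) - slope * (g - mean_g) for x, g in zip(a, given)]


def partial_corr(a, b, given):
    """Correlation of a and b after linearly removing the effect of `given`."""
    return pearson(residuals(a, given), residuals(b, given))


# Hypothetical synthetic chain x -> y -> z (illustrative only, not real omics data):
# x: an upstream genomic feature, y: an intermediate feature, z: an outcome score.
rng = random.Random(0)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0, 1) for xi in x]
z = [0.8 * yi + rng.gauss(0, 1) for yi in y]

marginal = pearson(x, z)             # clearly non-zero: x influences z through y
conditional = partial_corr(x, z, y)  # close to zero: no direct x -> z effect

print(f"corr(x, z)     = {marginal:.3f}")
print(f"corr(x, z | y) = {conditional:.3f}")
```

On real data the true graph is unknown, so a result like this cannot be checked against ground truth, which is the evaluation problem the abstract proposes to address with large language models.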
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Genetics, Bioinformatics, and Biomedical Research · Bioinformatics and Genomic Networks
