Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification
Gexin Huang, Chenfei Wu, Mingjie Li, Xiaojun Chang, Ling Chen, Ying Sun, Shen Zhao, Xiaodan Liang, and Liang Lin

TL;DR
This paper introduces a novel multi-label Transformer model, BPGT, that leverages biomedical knowledge and gene relationships to improve genetic mutation prediction from whole slide images, addressing inefficiencies and biological oversight in prior methods.
Contribution
The paper proposes a biological-knowledge enhanced Transformer architecture with a gene encoder and label decoder, integrating linguistic and biomedical knowledge for better mutation prediction.
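To make the multi-label framing concrete, here is a minimal sketch of a single output head that scores every gene at once, in contrast to training one binary classifier per gene. The gene names, logits, labels, and the choice of a per-gene sigmoid with mean binary cross-entropy are illustrative assumptions, not details confirmed by the paper.

```python
import math

def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over all genes for one slide.

    Assumed loss for illustration; the paper's actual objective may differ.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)

# Toy values, invented for illustration only.
genes = ["TP53", "PIK3CA", "PTEN"]   # hypothetical label set
logits = [2.0, -1.0, -0.5]           # one model output per gene
targets = [1, 0, 0]                  # toy mutation labels for one slide

probs = [sigmoid(z) for z in logits]
# Each gene is thresholded independently, so any subset can be predicted.
predicted = [g for g, p in zip(genes, probs) if p >= 0.5]
loss = multilabel_bce(logits, targets)
```

Because all genes share one forward pass and one loss, the model is trained once rather than once per gene, which is the efficiency argument the summary makes against per-gene binary classifiers.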
Findings
BPGT outperforms state-of-the-art methods on the TCGA benchmark.
The gene encoder effectively captures gene relationships and priors.
The model improves prediction accuracy by emphasizing mutation status comparisons.
Abstract
Predicting genetic mutations from whole slide images is indispensable for cancer diagnosis. However, existing work that trains multiple binary classification models faces two challenges: (a) training multiple binary classifiers is inefficient and inevitably leads to a class imbalance problem; (b) the biological relationships among genes are overlooked, which limits prediction performance. To tackle these challenges, we design a Biological-knowledge enhanced PathGenomic multi-label Transformer (BPGT) to improve genetic mutation prediction performance. BPGT first establishes a novel gene encoder that constructs gene priors with two carefully designed modules: (a) a gene graph whose node features are the genes' linguistic descriptions and the cancer phenotype, with edges modeled by genes' pathway associations and mutation consistencies; (b) a knowledge association module that…
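The gene graph described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the input dictionaries, the definition of "mutation consistency" as the fraction of patients in which two genes share the same mutation status, and the `threshold` cutoff are all hypothetical and are not taken from the paper. Node features (linguistic descriptions and cancer phenotype) are omitted.

```python
from itertools import combinations

def build_gene_graph(pathways, mutations, threshold=0.5):
    """Return a set of undirected edges (gene_a, gene_b).

    pathways:  dict mapping gene -> set of pathway names (hypothetical input)
    mutations: dict mapping gene -> non-empty list of 0/1 labels, one per patient
    threshold: assumed cutoff on label agreement; not from the paper
    """
    edges = set()
    for a, b in combinations(sorted(pathways), 2):
        # Pathway association: the two genes share at least one pathway.
        shares_pathway = bool(pathways[a] & pathways[b])
        # Assumed consistency measure: fraction of patients with equal status.
        pairs = list(zip(mutations[a], mutations[b]))
        agreement = sum(x == y for x, y in pairs) / len(pairs)
        if shares_pathway or agreement >= threshold:
            edges.add((a, b))
    return edges

# Toy data, invented for illustration only.
pathways = {
    "TP53": {"p53 signaling"},
    "PIK3CA": {"PI3K-Akt"},
    "PTEN": {"PI3K-Akt"},
}
mutations = {
    "TP53": [1, 0, 1, 0],
    "PIK3CA": [0, 1, 0, 1],
    "PTEN": [1, 0, 1, 1],
}
edges = build_gene_graph(pathways, mutations, threshold=0.75)
```

In this toy example, PIK3CA and PTEN are linked through a shared pathway, while PTEN and TP53 are linked only because their labels agree in three of four patients.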
Taxonomy
Topics: Machine Learning in Bioinformatics · Genetics, Bioinformatics, and Biomedical Research · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention
