Prostate-VarBench: A Benchmark with Interpretable TabNet Framework for Prostate Cancer Variant Classification
Abraham Francisco Arellano Tavara, Umesh Kumar, Jathurshan Pradeepkumar, Jimeng Sun

TL;DR
This paper introduces Prostate-VarBench, a comprehensive prostate-specific variant benchmark, and develops an interpretable TabNet model that improves prostate cancer variant classification accuracy and reduces uncertain classifications.
Contribution
It creates a harmonized, prostate-specific variant dataset and demonstrates an interpretable machine learning model for better classification of prostate cancer variants.
Findings
Achieved 89.9% accuracy on test data
Reduced VUS by 6.5% through data correction
Provided interpretable model rationales aligned with clinical review
Abstract
Variants of Uncertain Significance (VUS) limit the clinical utility of prostate cancer genomics by delaying diagnosis and therapy when evidence for pathogenicity or benignity is incomplete. Progress is further limited by inconsistent annotations across sources and the absence of a prostate-specific benchmark for fair comparison. We introduce Prostate-VarBench, a curated pipeline for creating prostate-specific benchmarks that integrates COSMIC (somatic cancer mutations), ClinVar (expert-curated clinical variants), and TCGA-PRAD (prostate tumor genomics from The Cancer Genome Atlas) into a harmonized dataset of 193,278 variants supporting patient- or gene-aware splits to prevent data leakage. To ensure data integrity, we corrected a Variant Effect Predictor (VEP) issue that merged multiple transcript records, introducing ambiguity in clinical significance fields. We then standardized 56…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsProstate Cancer Diagnosis and Treatment · Machine Learning in Healthcare · AI in cancer detection
