Evaluating Computational Pathology Foundation Models for Prostate Cancer Grading under Distribution Shifts
Fredrik K. Gustafsson, Mattias Rantalainen

TL;DR
This study benchmarks the robustness of pathology foundation models for prostate cancer grading under distribution shifts, revealing strengths in in-distribution performance but vulnerabilities in cross-site generalization.
Contribution
The paper provides a comprehensive evaluation of how robust pathology foundation models (PFMs) are to distribution shifts in prostate cancer grading, and highlights the limits of large-scale pretraining as a route to generalization.
Findings
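In in-distribution settings, PFMs consistently achieve strong performance and clearly outperform a natural-image baseline.
Under cross-site transfer from Radboud to Karolinska, performance drops substantially for all models.
PFMs are less affected by shifts in the label distribution over grade groups; shifts in image appearance across collection sites are the main challenge.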
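A label-distribution shift of the kind mentioned above can be simulated by resampling an evaluation set so that its grade-group frequencies follow a chosen target. The sketch below is only an illustration under stated assumptions, not the authors' protocol: the function name `resample_to_label_distribution`, the six integer grade labels 0 to 5, and the target proportions are all hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict


def resample_to_label_distribution(labels, target_probs, n_samples, seed=0):
    """Pick slide indices whose label frequencies follow `target_probs`.

    labels:       list of integer grade labels, one per slide (assumed 0..5).
    target_probs: dict mapping label -> desired proportion (should sum to 1).
    n_samples:    size of the resampled evaluation set.
    """
    rng = random.Random(seed)

    # Group slide indices by their grade label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    chosen = []
    for label, prob in target_probs.items():
        k = round(prob * n_samples)  # number of slides wanted for this grade
        pool = by_label.get(label, [])
        if k > len(pool):
            raise ValueError(f"not enough slides with label {label}: need {k}, have {len(pool)}")
        # Sample without replacement so no slide is evaluated twice.
        chosen.extend(rng.sample(pool, k))

    rng.shuffle(chosen)
    return chosen


# Hypothetical data: 100 slides per grade (a uniform source distribution).
labels = [grade for grade in range(6) for _ in range(100)]
# Hypothetical shifted target, skewed towards low grades.
target = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.05, 4: 0.03, 5: 0.02}

indices = resample_to_label_distribution(labels, target, n_samples=100)
shifted_counts = Counter(labels[i] for i in indices)
```

Evaluating the same trained model on the original and the resampled index sets would then isolate the effect of the label shift, since image appearance is held fixed.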
Abstract
Pathology foundation models (PFMs) have emerged as powerful pretrained encoders for computational pathology, but their robustness under clinically relevant distribution shifts remains insufficiently understood. We benchmark the robustness of recent PFMs in the setting of prostate cancer grading from whole-slide images (WSIs). Using the PANDA dataset, we evaluate PFMs as frozen patch-level feature extractors within weakly supervised slide-level grading models, and assess robustness to two important forms of distribution shift: shifts in WSI image appearance across collection sites, and shifts in the label distribution over cancer grade groups. Across in-distribution settings, PFMs consistently achieve strong performance and clearly outperform a natural-image baseline. Under cross-site transfer from Radboud to Karolinska, however, performance drops substantially for all models, showing…
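The setup described in the abstract, frozen patch-level features aggregated into a slide-level prediction, can be sketched with a simple attention-weighted pooling step. This is a minimal illustration under assumptions: the abstract does not say which aggregator the authors use, and the scoring rule here (a single dot product with a vector `w`) is a simplified stand-in for a learned attention module. The names `softmax` and `attention_pool` are hypothetical helpers, and the feature vectors are toy values rather than real PFM embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, w):
    """Aggregate per-patch feature vectors into one slide-level vector.

    patch_features: list of equal-length feature vectors, one per patch,
                    assumed to come from a frozen encoder.
    w:              attention scoring vector (learned in practice; fixed here).
    Returns (slide_feature, attention_weights).
    """
    # One scalar relevance score per patch.
    scores = [sum(wj * hj for wj, hj in zip(w, h)) for h in patch_features]
    # Normalize the scores into weights that sum to 1.
    alphas = softmax(scores)
    # Weighted average of the patch features.
    dim = len(patch_features[0])
    pooled = [
        sum(a * h[d] for a, h in zip(alphas, patch_features))
        for d in range(dim)
    ]
    return pooled, alphas


# Toy example: two patches with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0]]
uniform_pooled, uniform_alphas = attention_pool(features, [0.0, 0.0])
focused_pooled, focused_alphas = attention_pool(features, [10.0, 0.0])
```

With a zero scoring vector every patch gets equal weight, so the pooled feature is the plain mean. With a scoring vector that favours the first feature dimension, the first patch dominates. A slide-level grading head would take the pooled vector as input, and only that head and the scoring parameters would be trained while the encoder stays frozen.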
