Do Histopathological Foundation Models Eliminate Batch Effects? A Comparative Study
Jonah Kömen, Hannah Marienwald, Jonas Dippel, Julius Hense

TL;DR
This study evaluates whether recent large-scale histopathological foundation models effectively eliminate batch effects across hospitals, revealing persistent biases in feature embeddings despite their high generalization performance.
Contribution
It systematically assesses the presence of hospital-specific signatures in foundation model embeddings and highlights limitations of stain normalization in removing batch effects.
Findings
Foundation model embeddings still contain hospital signatures.
Stain normalization does not remove hospital-specific biases.
Hospital signatures influence model predictions and can cause biases.
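One common way to check for a hospital signature is to train a simple probe that predicts the source hospital from the embeddings alone: accuracy well above chance means the embeddings encode site information. The sketch below illustrates the idea on synthetic vectors only. It is not the authors' protocol, and the nearest-centroid probe, the function names, and the `shift` parameter (strength of the simulated per-hospital offset) are all assumptions made for illustration.

```python
import random


def make_embeddings(n_per_hospital=200, dim=16, n_hospitals=3, shift=1.0, seed=0):
    """Synthetic stand-in for foundation-model embeddings (not real data)."""
    rng = random.Random(seed)
    # Each hypothetical hospital gets one fixed offset vector: its "signature".
    offsets = [[rng.gauss(0.0, shift) for _ in range(dim)] for _ in range(n_hospitals)]
    X, y = [], []
    for h in range(n_hospitals):
        for _ in range(n_per_hospital):
            # Embedding = hospital offset + unit-variance Gaussian noise.
            X.append([offsets[h][d] + rng.gauss(0.0, 1.0) for d in range(dim)])
            y.append(h)
    return X, y


def fit_centroids(X, y):
    """Mean embedding per hospital label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {k: [s / counts[k] for s in v] for k, v in sums.items()}


def predict(centroids, x):
    """Assign x to the hospital with the closest centroid (squared Euclidean)."""
    return min(
        centroids,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(centroids[k], x)),
    )


def hospital_probe_accuracy(X, y, seed=0):
    """Fit the probe on a random half of the data, score it on the other half."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    split = len(idx) // 2
    train, test = idx[:split], idx[split:]
    centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
    correct = sum(predict(centroids, X[i]) == y[i] for i in test)
    return correct / len(test)


if __name__ == "__main__":
    for shift in (0.0, 1.0):
        X, y = make_embeddings(shift=shift)
        print(f"shift={shift}: hospital probe accuracy = {hospital_probe_accuracy(X, y):.2f}")
```

With `shift=0.0` there is no simulated signature, so the probe should land near chance (one in three here); with a nonzero shift it should do much better. On real embeddings, the same comparison would be run before and after stain normalization to see whether the signature survives.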
Abstract
Deep learning has led to remarkable advancements in computational histopathology, e.g., in diagnostics, biomarker prediction, and outcome prognosis. Yet the lack of annotated data and the impact of batch effects, i.e., systematic technical differences in data across hospitals, hamper model robustness and generalization. Recent histopathological foundation models (pretrained on millions to billions of images) have been reported to improve generalization performance on various downstream tasks. However, it has not been systematically assessed whether they fully eliminate batch effects. In this study, we empirically show that the feature embeddings of the foundation models still contain distinct hospital signatures that can lead to biased predictions and misclassifications. We further find that the signatures are not removed by stain normalization methods, dominate distances in feature…
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging
