A protocol for evaluating robustness to H&E staining variation in computational pathology models
Lydia A. Schönpflug, Nikki van den Berg, Sonali Andani, Nanda Horeweg, Jurriaan Barkey Wolf, Tjalling Bosse, Viktor H. Koelzer, Maxime W. Lafarge

TL;DR
This paper introduces a three-step protocol to systematically evaluate how variations in H&E staining affect the robustness and performance of computational pathology models, aiding reliable deployment.
Contribution
The authors developed a novel evaluation protocol and created a reference staining library to assess model robustness to staining variability in computational pathology.
Findings
Classification performance ranged from AUC 0.769 to 0.911.
Robustness ranged from 0.007 to 0.079.
Robustness and performance were weakly inversely correlated.
Abstract
Sensitivity to staining variation remains a major barrier to deploying computational pathology (CPath) models: hematoxylin and eosin (H&E) staining varies across laboratories, so the effect of this variability on model predictions needs systematic assessment. In this work, we developed a three-step protocol for evaluating robustness to H&E staining variation in CPath models. Step 1: select reference staining conditions. Step 2: characterize test set staining properties. Step 3: apply CPath model(s) under simulated reference staining conditions. We first created a new reference staining library based on the PLISM dataset. As an exemplary use case, we applied the protocol to assess the robustness properties of 306 microsatellite instability (MSI) classification models on the unseen SurGen colorectal cancer dataset (n=738), including 300 attention-based multiple instance learning…
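To make Step 3 more concrete, the following is a minimal sketch of one common way to simulate a different staining condition: unmix an image into hematoxylin and eosin concentrations in optical-density space (Beer-Lambert), then recombine them with another set of stain vectors. It is not the authors' implementation; the abstract does not specify how the simulation is done. The function names (`rgb_to_od`, `od_to_rgb`, `restain`, `predict_under_conditions`), the stain vectors in the usage note, and the `model` callable are all hypothetical and introduced only for illustration.

```python
import numpy as np

def rgb_to_od(rgb, background=255.0):
    """Convert RGB intensities to optical density (Beer-Lambert law)."""
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1.0, background)
    return -np.log(rgb / background)

def od_to_rgb(od, background=255.0):
    """Convert optical density back to 8-bit RGB."""
    rgb = background * np.exp(-np.asarray(od, dtype=np.float64))
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8)

def restain(rgb, source_stains, target_stains):
    """Re-render an image under different stain vectors.

    source_stains, target_stains: arrays of shape (2, 3); rows are the
    assumed optical-density vectors for hematoxylin and eosin.
    """
    rgb = np.asarray(rgb)
    od = rgb_to_od(rgb).reshape(-1, 3)
    # Least-squares unmixing: od ~= conc @ source_stains
    conc, *_ = np.linalg.lstsq(source_stains.T, od.T, rcond=None)
    conc = np.clip(conc, 0.0, None)   # concentrations cannot be negative
    new_od = conc.T @ target_stains   # recombine with the target stain vectors
    return od_to_rgb(new_od).reshape(rgb.shape)

def predict_under_conditions(rgb, source_stains, reference_library, model):
    """Apply a model to the same image under each reference condition.

    reference_library: dict mapping a condition name to a (2, 3) stain matrix.
    model: any callable taking an RGB array and returning a score (hypothetical).
    """
    return {
        name: model(restain(rgb, source_stains, target))
        for name, target in reference_library.items()
    }
```

As a usage illustration, `predict_under_conditions(tile, H_E, {"ref_a": H_E, "ref_b": 1.5 * H_E}, model)` would return one score per simulated condition for a single tile, where `H_E` is an assumed stain matrix such as `np.array([[0.65, 0.70, 0.29], [0.07, 0.99, 0.11]])`. How the spread of such scores is summarized into a robustness value is defined in the paper, not here.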
Taxonomy
Topics: AI in cancer detection · Digital Imaging for Blood Diseases · Gene expression and cancer classification
