A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study
Sarah N Dudgeon (1), Si Wen (1), Matthew G Hanna (2), Rajarsi Gupta (3), Mohamed Amgad (4), Manasi Sheth (5), Hetal Marble (6), Richard Huang (6), Markus D Herrmann (7), Clifford H. Szu (8), Darick Tong (8), Bruce Werness (8), Evan Szu (8), Denis Larsimont (9)

TL;DR
This paper describes the creation of a pathologist-annotated dataset for validating AI algorithms in estimating stromal tumor infiltrating lymphocytes in breast cancer, including workflows, pilot results, and plans for regulatory readiness.
Contribution
It introduces new workflows for efficient annotation, pilot data demonstrating variability and correlations, and statistical methods for validation in digital pathology.
Findings
sTIL densities are correlated within cases
Notable pathologist variability observed
Workflows enable efficient data collection
Abstract
Purpose: In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images (WSIs). We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor infiltrating lymphocytes (sTILs) in breast cancer. Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained ductal carcinoma core biopsies prepared at a single clinical site. We created training materials and workflows to crowdsource pathologist image annotations in two modes: an optical microscope and two digital platforms. For each region of interest (ROI), the workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and, if appropriate, the sTIL density value for that ROI. Results: The pilot study yielded an abundant number of cases with nominal sTIL infiltration.…
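The three items the workflows collect per ROI (ROI type, an appropriateness decision, and a density value only when the ROI is appropriate) can be pictured as a small record. The following is a minimal sketch and not the project's actual data format: the class name, the field names, the modality labels, and the assumption that density is a percentage between 0 and 100 are all hypothetical. The averaging helper is only an illustration of how such records might be summarized; it is not the statistical method used in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ROIAnnotation:
    """One pathologist's annotation of one ROI (all field names are hypothetical)."""
    slide_id: str
    roi_id: str
    reader_id: str
    modality: str   # e.g. "microscope" or "digital" (labels assumed)
    roi_type: str   # free-text label; the categories are not listed in the abstract
    evaluable: bool  # is the ROI appropriate for estimating sTIL density?
    stil_density: Optional[float] = None  # assumed percent, 0-100; only if evaluable

    def __post_init__(self):
        # A density is recorded if and only if the ROI was judged appropriate.
        if self.evaluable:
            if self.stil_density is None:
                raise ValueError("evaluable ROI needs a sTIL density")
            if not 0.0 <= self.stil_density <= 100.0:
                raise ValueError("sTIL density must be in [0, 100]")
        elif self.stil_density is not None:
            raise ValueError("non-evaluable ROI must not carry a density")


def mean_density_per_roi(annotations):
    """Average density over readers for each (slide_id, roi_id).

    Annotations marked non-evaluable are skipped, so an ROI that no reader
    judged appropriate does not appear in the result.
    """
    sums = {}
    for a in annotations:
        if a.evaluable:
            key = (a.slide_id, a.roi_id)
            total, n = sums.get(key, (0.0, 0))
            sums[key] = (total + a.stil_density, n + 1)
    return {key: total / n for key, (total, n) in sums.items()}
```

Enforcing the "density only if appropriate" rule at construction time keeps inconsistent records out of any later analysis, which is one reasonable design choice among several.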
