Lizard: A Large-Scale Dataset for Colonic Nuclear Instance Segmentation and Classification
Simon Graham, Mostafa Jahanifar, Ayesha Azam, Mohammed Nimir, Yee-Wah Tsang, Katherine Dodd, Emily Hero, Harvir Sahota, Atisha Tank, Ksenija Benes, Noorul Wahab, Fayyaz Minhas, Shan E Ahmed Raza, Hesham El Daly, Kishore Gopalakrishnan, David Snead, Nasir Rajpoot

TL;DR
This paper introduces Lizard, a large-scale dataset for colonic nuclear instance segmentation and classification. The dataset was created through a multi-stage annotation pipeline with pathologist-in-the-loop refinement and is intended to support the development of computational pathology models.
Contribution
The paper presents the largest nuclear segmentation and classification dataset for colon tissue, built using a novel multi-stage annotation process with expert input.
Findings
Largest dataset with nearly 500,000 labeled nuclei
Effective multi-stage annotation pipeline with expert refinement
Supports development of improved nuclear segmentation and classification models for computational pathology
Abstract
The development of deep segmentation models for computational pathology (CPath) can help foster the investigation of interpretable morphological biomarkers. Yet, there is a major bottleneck in the success of such approaches because supervised deep learning models require an abundance of accurately labelled data. This issue is exacerbated in the field of CPath because the generation of detailed annotations usually demands the input of a pathologist to be able to distinguish between different tissue constructs and nuclei. Manually labelling nuclei may not be a feasible approach for collecting large-scale annotated datasets, especially when a single image region can contain thousands of different cells. However, solely relying on automatic generation of annotations will limit the accuracy and reliability of ground truth. Therefore, to help overcome the above challenges, we propose a…
