Inference of captions from histopathological patches
Masayuki Tsuneki, Fahdi Kanavati

TL;DR
This paper introduces PatchGastricADC22, a dataset of gastric adenocarcinoma histopathology patches paired with captions, and trains a baseline model that generates diagnostic captions, with the aim of making pathologists' diagnostic workflows more efficient.
Contribution
The study provides a publicly available dataset of 262K histopathological patches with captions extracted from diagnostic reports, together with a baseline attention-based model for caption prediction.
Findings
Promising caption prediction results from the baseline model
Creation of a 262K patch dataset with diagnostic captions
Groundwork for future research in automated histopathology reporting
Abstract
Computational histopathology has made significant strides in the past few years, slowly getting closer to clinical adoption. One area of benefit would be the automatic generation of diagnostic reports from H&E-stained whole slide images, which would further increase the efficiency of pathologists' routine diagnostic workflows. In this study, we compiled a dataset (PatchGastricADC22) of histopathological captions of stomach adenocarcinoma endoscopic biopsy specimens, which we extracted from diagnostic reports and paired with patches extracted from the associated whole slide images. The dataset contains a variety of gastric adenocarcinoma subtypes. We trained a baseline attention-based model to predict the captions from features extracted from the patches and obtained promising results. We make the captioned dataset of 262K patches publicly available.
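The abstract does not specify the baseline architecture beyond "attention-based", so the following is only a generic sketch of the underlying idea, not the authors' model: scaled dot-product attention pools a set of patch feature vectors into a single context vector that a caption decoder could consume. The function names `softmax` and `attention_pool`, the feature dimension, and the random features and query are all illustrative assumptions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, query):
    """Pool patch features into one context vector (hypothetical helper).

    patch_features: list of N feature vectors, each of length D
    query: vector of length D (e.g. a decoder state; an assumption here)
    Returns (context, weights): the weighted average of the features and
    the attention weight assigned to each patch.
    """
    scale = math.sqrt(len(query))
    # Similarity of each patch feature to the query, scaled by sqrt(D).
    scores = [
        sum(f * q for f, q in zip(feat, query)) / scale
        for feat in patch_features
    ]
    weights = softmax(scores)
    dim = len(patch_features[0])
    # Context vector: attention-weighted sum of the patch features.
    context = [
        sum(w * feat[d] for w, feat in zip(weights, patch_features))
        for d in range(dim)
    ]
    return context, weights


# Toy usage with made-up features: 5 patches, 4-dimensional features.
random.seed(0)
patch_features = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
query = [random.gauss(0, 1) for _ in range(4)]
context, weights = attention_pool(patch_features, query)
```

In a captioning setting, a step like this would typically be repeated for each generated word, with the query derived from the decoder's current state, so that different patches can dominate at different points in the caption.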
Taxonomy
Topics: AI in cancer detection · Colorectal Cancer Screening and Detection · Cancer-related molecular mechanisms research
