Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

Jevgenij Gamper; Nasir Rajpoot

arXiv:2103.05121·cs.CV·March 10, 2021

TL;DR

This paper introduces ARCH, a computational pathology captioning dataset with dense diagnostic and morphological descriptions, and shows that an encoder pre-trained on it transfers to a variety of pathology sub-tasks better than ImageNet features.

Contribution

The paper presents ARCH, a novel dense-captioning dataset for computational pathology, and shows its representations transfer effectively across multiple pathology tasks.

Findings

01 ARCH is the only CP dataset whose intrinsic dimensionality rivals that of MS-COCO Captions.

02 Encoders pre-trained on ARCH outperform ImageNet features on pathology sub-tasks.

03 ARCH representations transfer better than features obtained via self-supervised or multi-task learning on pathology images alone.

Abstract

We present ARCH, a computational pathology (CP) multiple instance captioning dataset to facilitate dense supervision of CP tasks. Existing CP datasets focus on narrow tasks; ARCH on the other hand contains dense diagnostic and morphological descriptions for a range of stains, tissue types and pathologies. Using intrinsic dimensionality estimation, we show that ARCH is the only CP dataset to (ARCH-)rival its computer vision analog MS-COCO Captions. We conjecture that an encoder pre-trained on dense image captions learns transferable representations for most CP tasks. We support the conjecture with evidence that ARCH representation transfers to a variety of pathology sub-tasks better than ImageNet features or representations obtained via self-supervised or multi-task learning on pathology images alone. We release our best model and invite other researchers to test it on their CP tasks.
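The abstract refers to intrinsic dimensionality estimation without naming the estimator. As an illustration only, the sketch below uses the Levina-Bickel maximum-likelihood estimator, which is an assumption and may not be the method the authors used. The function name `mle_intrinsic_dimension` and the parameter `k` are hypothetical, and `features` stands for any array of per-sample feature vectors (for example, encoder embeddings of images or captions).

```python
import numpy as np


def mle_intrinsic_dimension(features, k=20):
    """Levina-Bickel MLE of intrinsic dimension, averaged over all points.

    Assumes `features` has shape (n_samples, n_features) and contains no
    duplicate rows (a zero neighbor distance would break the log-ratio).
    """
    x = np.asarray(features, dtype=np.float64)
    n = x.shape[0]
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < n_samples")

    # Pairwise squared Euclidean distances via the Gram-matrix identity.
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    np.fill_diagonal(sq_dists, np.inf)       # exclude each point from its own neighbors

    # Distances to the k nearest neighbors of every point, in ascending order.
    knn_sq = np.sort(np.partition(sq_dists, k - 1, axis=1)[:, :k], axis=1)
    knn = np.sqrt(knn_sq)

    # Per-point estimate: inverse mean of log(T_k / T_j) for j = 1 .. k-1.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 1) / np.sum(log_ratios, axis=1)
    return float(np.mean(per_point))
```

Applied to feature sets from two datasets with the same `k`, the two returned values give a rough basis for comparing their intrinsic dimensionality. The estimate depends on `k` and on sample size, so comparisons are only meaningful when both are held fixed.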

Code & Models

Repositories

kdexd/virtex (PyTorch, official)
