SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images

Yicheng Song; Tiancheng Lin; Die Peng; Su Yang; Yi Xu

arXiv:2505.06710·cs.CV·May 13, 2025

TL;DR

This paper introduces SimMIL, a weakly-supervised pre-training framework for multi-instance learning in whole slide pathology images, improving feature extraction and downstream task performance.

Contribution

It presents the first dedicated weakly-supervised pre-training scheme for MIL, enhancing instance-level feature learning in pathology images.

Findings

01 Outperforms ImageNet and self-supervised pre-training methods

02 Effective across multiple large-scale WSI datasets

03 Scalable to multi-dataset pre-training and fine-tuning

Abstract

Various multi-instance learning (MIL) based approaches have been developed and successfully applied to whole-slide pathological images (WSI). Existing MIL methods emphasize the importance of feature aggregators but largely neglect instance-level representation learning. They assume that a pre-trained feature extractor is available and can be directly used or fine-tuned, which is not always the case. This paper proposes to pre-train the feature extractor for MIL via a weakly-supervised scheme, i.e., propagating the weak bag-level labels to the corresponding instances for supervised learning. To learn effective features for MIL, we further delve into several key components, including strong data augmentation, a non-linear prediction head, and a robust loss function. We conduct experiments on common large-scale WSI datasets and find it achieves better performance than other…
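
The label-propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `propagate_bag_labels`, the `Patch` alias, and the toy data are all hypothetical, and each patch is represented as a short list of floats rather than an image. The sketch only shows how every instance inherits the label of the bag (slide) it came from, which yields an ordinary supervised dataset for pre-training a feature extractor.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for an image patch (instance); a real pipeline
# would hold image tensors here.
Patch = Sequence[float]


def propagate_bag_labels(
    bags: List[Tuple[List[Patch], int]],
) -> List[Tuple[Patch, int]]:
    """Assign each instance the label of the bag (slide) it belongs to."""
    instance_dataset = []
    for instances, bag_label in bags:
        for instance in instances:
            # The instance inherits the weak bag-level label unchanged.
            instance_dataset.append((instance, bag_label))
    return instance_dataset


# Two toy "slides": one labeled 1 with three patches, one labeled 0 with two.
toy_bags = [
    ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], 1),
    ([[0.7, 0.8], [0.9, 1.0]], 0),
]

instance_dataset = propagate_bag_labels(toy_bags)
labels = [label for _, label in instance_dataset]
print(len(instance_dataset), labels)
```

Under the standard MIL assumption, a positive bag may contain negative instances, so labels propagated this way are noisy for positive bags. That is one plausible reading of why the abstract lists a robust loss function among the key components.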

Taxonomy

Topics: AI in cancer detection · Image Retrieval and Classification Techniques · Digital Imaging for Blood Diseases