Compound Figure Separation of Biomedical Images: Mining Large Datasets for Self-supervised Learning
Tianyuan Yao, Chang Qu, Jun Long, Quan Liu, Ruining Deng, Yuanhan Tian, Jiachen Xu, Aadarsh Jha, Zuhayr Asad, Shunxing Bao, Mengyang Zhao, Agnes B. Fogo, Bennett A. Landman, Haichun Yang, Catie Chang, Yuankai Huo

TL;DR
This paper introduces SimCFS, a novel framework for separating compound biomedical images into individual images without bounding box annotations, enhancing large-scale data collection for self-supervised learning in medical imaging.
Contribution
The study presents a resource-efficient, annotation-free method for compound figure separation, enabling better utilization of large unannotated biomedical image datasets for self-supervised learning.
Findings
Achieved state-of-the-art performance on the ImageCLEF 2016 dataset.
Pretrained models improved downstream image classification accuracy.
The proposed method reduces reliance on extensive bounding box annotations.
Abstract
With the rapid development of self-supervised learning (e.g., contrastive learning), the importance of having large-scale images (even without annotations) for training a more generalizable AI model has been widely recognized in medical image analysis. However, collecting task-specific unannotated data at scale can be challenging for individual labs. Existing online resources, such as digital books, publications, and search engines, provide a new resource for obtaining large-scale images. However, published images in healthcare (e.g., radiology and pathology) include a considerable number of compound figures with subplots. To extract and separate compound figures into usable individual images for downstream learning, we propose a simple compound figure separation (SimCFS) framework without using the traditionally required detection bounding box annotations, with…
Taxonomy
Methods: Contrastive Learning
