Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering
Duy M. H. Nguyen, Hoang Nguyen, Mai T. N. Truong, Tri Cao, Binh T. Nguyen, Nhat Ho, Paul Swoboda, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag

TL;DR
This paper introduces a joint self-supervised learning framework for 2D and 3D medical data, leveraging contrastive clustering and Transformer-based features to improve downstream medical imaging tasks.
Contribution
It proposes a novel unsupervised joint learning method for 2D and 3D medical data using contrastive clustering and Transformer-based holistic features.
Findings
Outperforms plain 2D Deep-ClusterV2 and SwAV methods.
Surpasses various modern 2D and 3D SSL approaches.
Effective in multiple downstream medical imaging tasks.
Abstract
Collecting large-scale medical datasets with fully annotated samples for training of deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL…
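The abstract's starting point, a single training set built from 2D images together with 2D slices extracted from 3D volumes, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `volume_to_slices` and `build_joint_pool`, the (D, H, W) axis layout, and the random arrays standing in for scans are all assumptions made for illustration.

```python
import numpy as np

def volume_to_slices(volume, axis=0):
    """Split a 3D array into a list of 2D slices along `axis` (assumed layout: D, H, W)."""
    if volume.ndim != 3:
        raise ValueError("expected a 3D array")
    return [np.take(volume, i, axis=axis) for i in range(volume.shape[axis])]

def build_joint_pool(images_2d, volumes_3d, axis=0):
    """Return (slice, origin) pairs, where origin records where each 2D sample came from.

    origin is ("image", image_index, None) for a native 2D image and
    ("volume", volume_index, slice_index) for a slice taken from a 3D volume.
    """
    pool = []
    for idx, img in enumerate(images_2d):
        pool.append((img, ("image", idx, None)))
    for vid, vol in enumerate(volumes_3d):
        for sid, sl in enumerate(volume_to_slices(vol, axis)):
            pool.append((sl, ("volume", vid, sid)))
    return pool

# Hypothetical data: three 2D images and two volumes with 10 and 5 slices.
rng = np.random.default_rng(0)
images = [rng.random((64, 64)) for _ in range(3)]
volumes = [rng.random((10, 64, 64)), rng.random((5, 64, 64))]
pool = build_joint_pool(images, volumes)
print(len(pool))  # 3 + 10 + 5 = 18
```

Keeping the volume and slice indices alongside each slice is one way to let a later stage regroup slices by their source volume; whether the paper tracks provenance this way is not stated in the text above.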
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Domain Adaptation and Few-Shot Learning · AI in cancer detection
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Softmax · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Linear Layer · Dense Connections · Adam
