vox2vec: A Framework for Self-supervised Contrastive Learning of Voxel-level Representations in Medical Images
Mikhail Goncharov, Vera Soboleva, Anvar Kurmukov, Maxim Pisov, and Mikhail Belyaev

TL;DR
vox2vec is a self-supervised contrastive learning framework that produces multi-scale voxel representations in medical images, improving segmentation performance with fewer trainable parameters.
Contribution
Introduces a contrastive SSL method that learns voxel-level representations of medical images, modeled by a Feature Pyramid Network.
Findings
Outperforms existing SSL techniques in segmentation tasks
Achieves competitive performance with significantly fewer trainable parameters
Effective in linear, non-linear, and end-to-end training setups
Abstract
This paper introduces vox2vec, a contrastive method for self-supervised learning (SSL) of voxel-level representations. vox2vec representations are modeled by a Feature Pyramid Network (FPN): a voxel representation is a concatenation of the corresponding feature vectors from different pyramid levels. The FPN is pre-trained to produce similar representations for the same voxel in different augmented contexts and distinctive representations for different voxels. This results in unified multi-scale representations that capture both global semantics (e.g., body part) and local semantics (e.g., different small organs or healthy versus tumor tissue). We use vox2vec to pre-train an FPN on more than 6500 publicly available computed tomography images. We evaluate the pre-trained representations by attaching simple heads on top of them and training the resulting models for 22 segmentation tasks.…
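The two ideas in the abstract, concatenating per-level features into one voxel representation and pulling together embeddings of the same voxel across augmented views, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The names `voxel_representation` and `info_nce`, the dictionary-based feature maps, the stride of `2 ** level` per pyramid level, the temperature value, and the use of a generic one-directional InfoNCE loss are all assumptions made for illustration only.

```python
import math


def voxel_representation(pyramid, voxel):
    """Concatenate the feature vectors that cover one voxel at every level.

    pyramid: list of feature maps; level k is a dict mapping a coarse
             (z, y, x) index to a feature vector. Level k is ASSUMED to
             have stride 2**k relative to the input grid.
    voxel:   (z, y, x) index on the full-resolution grid.
    """
    z, y, x = voxel
    features = []
    for level, feature_map in enumerate(pyramid):
        stride = 2 ** level
        features.extend(feature_map[(z // stride, y // stride, x // stride)])
    return features


def _normalize(vector):
    # Assumes a non-zero vector.
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss (an assumed stand-in for the paper's objective).

    view_a[i] and view_b[i] are embeddings of the same voxel seen in two
    augmented contexts (a positive pair); view_b[j] with j != i acts as a
    negative for view_a[i].
    """
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every embedding in the other view.
        logits = [sum(p * q for p, q in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the positive pair as the target class.
        total += log_sum - logits[i]
    return total / len(a)
```

In this sketch the loss is small when each voxel's two views have similar embeddings and large when they are closer to other voxels' embeddings, which matches the pre-training goal described in the abstract.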
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research
Methods: Convolution · 1x1 Convolution · Feature Pyramid Network
