Self-supervised 3D anatomy segmentation using self-distilled masked image transformer (SMIT)

Jue Jiang; Neelam Tyagi; Kathryn Tringale; Christopher Crane; Harini Veeraraghavan

arXiv:2205.10342·eess.IV·September 27, 2022

TL;DR

This paper introduces SMIT, a self-supervised learning method that combines masked image modeling with self-distillation to pre-train vision transformers for 3D medical image segmentation, reducing the amount of labeled fine-tuning data needed while improving segmentation accuracy.

Contribution

The paper proposes a self-distilled masked image transformer (SMIT) approach for SSL in 3D medical image segmentation. Its pretext task combines dense pixel-wise regression within masked patches (masked image prediction) with masked patch token distillation.

Findings

01. Achieved high accuracy with average DSC of 0.875 (MRI) and 0.878 (CT).
02. Required fewer fine-tuning datasets than other methods.
03. Validated across multiple organs and imaging modalities.
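The DSC reported above is the Dice similarity coefficient, 2|A∩B| / (|A| + |B|), for a predicted mask A and a reference mask B. The sketch below shows one common way to compute it per organ and average it. The function names, the convention of returning 1.0 when both masks are empty, and the averaging over organ labels are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)

def mean_dice(pred_labels, target_labels, organ_ids):
    """Average the per-organ DSC over a list of integer organ labels."""
    scores = [
        dice_coefficient(pred_labels == k, target_labels == k)
        for k in organ_ids
    ]
    return float(np.mean(scores))
```

With integer label volumes, `mean_dice(pred, ref, organ_ids=[1, 2, 3])` would give a single average over three organs; how the paper aggregates across organs and cases is not stated in this summary.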

Abstract

Vision transformers, with their ability to more efficiently model long-range context, have demonstrated impressive accuracy gains in several computer vision and medical image analysis tasks, including segmentation. However, such methods need large labeled datasets for training, which are hard to obtain for medical image analysis. Self-supervised learning (SSL) has demonstrated success in medical image segmentation using convolutional networks. In this work, we developed a self-distillation learning with masked image modeling method to perform SSL for vision transformers (SMIT), applied to 3D multi-organ segmentation from CT and MRI. Our contribution is a dense pixel-wise regression within masked patches, called masked image prediction, which we combined with masked patch token distillation as the pretext task to pre-train vision transformers. We…
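The two pretext losses named in the abstract can be illustrated with a small NumPy sketch: a regression loss computed only on voxels inside masked patches, and a distillation loss that matches student token distributions to teacher token distributions on the masked patches. Everything concrete here is an assumption for illustration and is not taken from the paper: the function names, the L1 form of the regression loss, the softmax temperatures, and the random masking scheme. The networks themselves are omitted; the arrays stand in for their outputs.

```python
import numpy as np

def random_patch_mask(grid_shape, mask_ratio, rng):
    """Boolean mask over the 3D patch grid; True marks a masked patch."""
    n = int(np.prod(grid_shape))
    n_masked = int(round(mask_ratio * n))
    flat = np.zeros(n, dtype=bool)
    flat[rng.choice(n, size=n_masked, replace=False)] = True
    return flat.reshape(grid_shape)

def patch_mask_to_voxels(patch_mask, patch_size):
    """Expand each patch flag into a patch_size^3 block of voxels."""
    m = patch_mask
    for axis in range(3):
        m = m.repeat(patch_size, axis=axis)
    return m

def masked_image_prediction_loss(recon, volume, voxel_mask):
    """Dense regression restricted to masked voxels (L1 is an assumption)."""
    return float(np.abs(recon - volume)[voxel_mask].mean())

def softmax(x, temp):
    z = x / temp
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def masked_token_distillation_loss(student_logits, teacher_logits,
                                   token_mask, t_student=0.1, t_teacher=0.04):
    """Cross-entropy from teacher to student distributions on masked tokens.

    Temperatures are hypothetical defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, t_teacher)
    log_p_student = np.log(softmax(student_logits, t_student) + 1e-12)
    ce = -(p_teacher * log_p_student).sum(axis=-1)
    return float(ce[token_mask].mean())
```

In a full pipeline the student would see the masked volume and the teacher the unmasked one, with the two losses summed; the weighting between them and how the teacher is updated are not given in this summary, so they are left out of the sketch.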
