Towards Optimal Patch Size in Vision Transformers for Tumor Segmentation

Ramtin Mojtahedi, Mohammad Hamghalam, Richard K. G. Do, and Amber L. Simpson

arXiv:2308.16598·eess.IV·September 1, 2023

TL;DR

This paper introduces a method to select the optimal input patch size for vision transformers in tumor segmentation, improving accuracy especially for small lesions by using a volume-based approach and transfer learning.

Contribution

It proposes a novel technique to determine the best patch size based on lesion volume, enhancing vision transformer performance in medical image segmentation.

Findings

1. Optimal patch size improves segmentation accuracy for small tumors.

2. Transfer learning with larger tumor volumes enhances performance on smaller lesions.

3. The method demonstrates consistent improvements on multi-resolution datasets.

Abstract

Detection of tumors in metastatic colorectal cancer (mCRC) plays an essential role in the early diagnosis and treatment of liver cancer. Deep learning models backboned by fully convolutional neural networks (FCNNs) have become the dominant model for segmenting 3D computerized tomography (CT) scans. However, since their convolution layers suffer from limited kernel size, they are not able to capture long-range dependencies and global context. To tackle this restriction, vision transformers have been introduced to solve FCNN's locality of receptive fields. Although transformers can capture long-range features, their segmentation performance decreases with various tumor sizes due to the model sensitivity to the input patch size. While finding an optimal patch size improves the performance of vision transformer-based models on segmentation tasks, it is a time-consuming and challenging…
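To make the patch-size sensitivity mentioned in the abstract concrete, the following is a minimal sketch of how the choice of patch size changes what a vision transformer sees. It is not the authors' selection method. The function names, the 96×96×96 input volume, and the 1000-voxel lesion are hypothetical values chosen only for illustration, and the sketch assumes non-overlapping cubic patches.

```python
import math

def num_tokens(volume_shape, patch_size):
    """Count the non-overlapping cubic patches (tokens) covering a 3D volume.

    Assumes each axis is padded up to a multiple of patch_size.
    """
    return math.prod(math.ceil(d / patch_size) for d in volume_shape)

def lesion_size_in_patches(lesion_volume_voxels, patch_size):
    """Express a lesion's volume as a fraction (or multiple) of one patch."""
    return lesion_volume_voxels / patch_size ** 3

# Hypothetical input volume and lesion size, for illustration only.
shape = (96, 96, 96)
lesion_voxels = 1000

for p in (8, 16, 32):
    tokens = num_tokens(shape, p)
    ratio = lesion_size_in_patches(lesion_voxels, p)
    print(f"patch={p:2d}  tokens={tokens:5d}  lesion/patch volume={ratio:.3f}")
```

Under these assumptions, a small patch yields many tokens, so a small lesion spans at least one full patch, while a large patch yields few tokens and the same lesion occupies only a small fraction of a single patch. This trade-off is one plausible reading of why segmentation quality depends on patch size for small tumors.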

Code & Models

Repositories

ramtin-mojtahedi/ovtps (Official)

Taxonomy

Methods: Convolution