MOSMOS: Multi-organ segmentation facilitated by medical report supervision

Weiwei Tian; Xinyu Huang; Junlin Hou; Caiyue Ren; Longquan Jiang; Rui-Wei Zhao; Gang Jin; Yuejie Zhang; Daoying Geng

arXiv:2409.02418·cs.CV·September 5, 2024

TL;DR

This paper introduces MOSMOS, a novel pre-training and fine-tuning framework that leverages medical report supervision to improve multi-organ segmentation across various datasets and models.

Contribution

The paper proposes a new framework combining contrastive learning and multi-label recognition to enhance fine-grained multi-organ segmentation using report supervision.

Findings

01. Effective across multiple datasets and modalities

02. Improves segmentation accuracy with report supervision

03. Generalizes to different network architectures

Abstract

Owing to a large amount of multi-modal data in modern medical systems, such as medical images and reports, Medical Vision-Language Pre-training (Med-VLP) has demonstrated incredible achievements in coarse-grained downstream tasks (i.e., medical classification, retrieval, and visual question answering). However, the problem of transferring knowledge learned from Med-VLP to fine-grained multi-organ segmentation tasks has barely been investigated. Multi-organ segmentation is challenging mainly due to the lack of large-scale fully annotated datasets and the wide variation in the shape and size of the same organ between individuals with different diseases. In this paper, we propose a novel pre-training & fine-tuning framework for Multi-Organ Segmentation by harnessing Medical repOrt Supervision (MOSMOS). Specifically, we first introduce global contrastive learning to maximally align the…
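The abstract is cut off before it says exactly what the global contrastive step aligns, so the following is only a minimal sketch of a generic symmetric image-report contrastive (InfoNCE-style) loss, as commonly used in vision-language pre-training. It is not the authors' implementation: the function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def symmetric_contrastive_loss(image_embs, report_embs, temperature=0.07):
    """Hypothetical global contrastive loss over paired embeddings.

    image_embs[k] and report_embs[k] are assumed to come from the same
    study; every other pairing in the batch is treated as a negative.
    The temperature is an assumed value, not taken from the paper.
    """
    images = [l2_normalize(v) for v in image_embs]
    reports = [l2_normalize(v) for v in report_embs]
    n = len(images)

    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, rep)) / temperature for rep in reports]
        for img in images
    ]

    # Image-to-report direction: each image should pick its own report.
    image_to_report = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Report-to-image direction: same objective on the transposed matrix.
    transposed = [list(col) for col in zip(*logits)]
    report_to_image = -sum(log_softmax(transposed[k])[k] for k in range(n)) / n

    return 0.5 * (image_to_report + report_to_image)


if __name__ == "__main__":
    # Toy embeddings (made up): matched pairs versus swapped pairs.
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_reports = [[1.0, 0.0], [0.0, 1.0]]
    swapped_reports = [[0.0, 1.0], [1.0, 0.0]]
    print(symmetric_contrastive_loss(images, matched_reports))
    print(symmetric_contrastive_loss(images, swapped_reports))
```

In this sketch the loss is close to zero when each image embedding is most similar to its own report embedding, and it grows when the pairing is swapped, which is the behaviour an alignment objective of this kind relies on.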

Taxonomy

Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging

Methods: Attention Is All You Need · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Linear Layer · Batch Normalization · 1x1 Convolution · Residual Connection · Multi-Head Attention · Max Pooling