# A Multi-Stage Fine-Tuning and Ensembling Strategy for Pancreatic Tumor Segmentation in Diagnostic and Therapeutic MRI

**Authors:** Omer Faruk Durugol, Maximilian Rokuss, Yannick Kirchhoff, Klaus H. Maier-Hein

arXiv: 2508.21775 · 2025-09-01

## TL;DR

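This paper presents an nnU-Net-based multi-stage fine-tuning and ensembling approach for segmenting Pancreatic Ductal Adenocarcinoma (PDAC) in MRI, submitted to the PANTHER challenge. Its heterogeneous ensembles reach top cross-validation Tumor Dice scores of 0.661 on Task 1 (diagnostic T1-weighted) and 0.523 on Task 2 (therapeutic T2-weighted), despite scarce annotated data and poor tumor-tissue contrast.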

## Contribution

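The paper proposes a multi-stage cascaded pre-training strategy, in which a general anatomical foundation model is fine-tuned sequentially on CT pancreatic lesion datasets and then on the target MRI modalities. It pairs this with metric-aware, heterogeneous ensembles of specialist models that exploit the trade-off observed between aggressive and default data augmentation.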

## Key findings

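- Aggressive data augmentation produced the highest volumetric accuracy in five-fold cross-validation.
- Default augmentations yielded better boundary precision, with an MASD of 5.46 mm and an HD95 of 17.33 mm on Task 1.
- Heterogeneous ensembles of specialist models achieved the top cross-validation Tumor Dice scores: 0.661 on Task 1 and 0.523 on Task 2.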
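The abstract reports overlap as a Tumor Dice score. As a minimal sketch of how such a score is typically computed on binary masks, the snippet below uses the standard definition $2|A \cap B| / (|A| + |B|)$. The function name `dice_score`, the toy masks, and the choice to return 1.0 when both masks are empty are illustrative assumptions, not details taken from the paper or the challenge evaluation code.

```python
import numpy as np


def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: counted as perfect agreement (an assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


# Toy 2D masks standing in for a predicted and a reference tumor mask.
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:3] = True      # 4 foreground pixels
target = np.zeros((4, 4), dtype=bool)
target[1:3, 1:4] = True    # 6 foreground pixels, 4 of them shared with pred

score = dice_score(pred, target)  # 2 * 4 / (4 + 6) = 0.8
print(score)
```

Because Dice only counts overlapping voxels, it says little about how far the predicted boundary lies from the reference one, which is why distance-based metrics such as MASD and HD95 are reported alongside it.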

## Abstract

Automated segmentation of Pancreatic Ductal Adenocarcinoma (PDAC) from MRI is critical for clinical workflows but is hindered by poor tumor-tissue contrast and a scarcity of annotated data. This paper details our submission to the PANTHER challenge, addressing both diagnostic T1-weighted (Task 1) and therapeutic T2-weighted (Task 2) segmentation. Our approach is built upon the nnU-Net framework and leverages a deep, multi-stage cascaded pre-training strategy, starting from a general anatomical foundation model and sequentially fine-tuning on CT pancreatic lesion datasets and the target MRI modalities. Through extensive five-fold cross-validation, we systematically evaluated data augmentation schemes and training schedules. Our analysis revealed a critical trade-off, where aggressive data augmentation produced the highest volumetric accuracy, while default augmentations yielded superior boundary precision (achieving a state-of-the-art MASD of 5.46 mm and HD95 of 17.33 mm for Task 1). For our final submission, we exploited this finding by constructing custom, heterogeneous ensembles of specialist models, essentially creating a mix of experts. This metric-aware ensembling strategy proved highly effective, achieving a top cross-validation Tumor Dice score of 0.661 for Task 1 and 0.523 for Task 2. Our work presents a robust methodology for developing specialized, high-performance models in the context of limited data and complex medical imaging tasks (Team MIC-DKFZ).
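The abstract does not say how the specialist models' outputs are combined. One common approach, shown here only as a hedged sketch, is to average per-model class-probability maps and take the voxel-wise argmax. The function `ensemble_probabilities`, the optional `weights` argument, and the toy probability maps are all hypothetical and are not drawn from the paper.

```python
import numpy as np


def ensemble_probabilities(prob_maps, weights=None):
    """Weighted average of per-model probability maps, each of shape (C, ...).

    Returns the voxel-wise argmax labels and the averaged probabilities.
    """
    stacked = np.stack(prob_maps, axis=0)            # (n_models, C, ...)
    if weights is None:
        weights = np.ones(len(prob_maps))            # plain mean by default
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                # normalise to sum to 1
    mean_probs = np.tensordot(weights, stacked, axes=(0, 0))  # (C, ...)
    return mean_probs.argmax(axis=0), mean_probs


# Two hypothetical models, 2 classes (background, tumor), a 1x2 "image".
model_a = np.array([[[0.9, 0.4]],    # class 0 probabilities
                    [[0.1, 0.6]]])   # class 1 probabilities
model_b = np.array([[[0.2, 0.3]],
                    [[0.8, 0.7]]])

labels, mean_probs = ensemble_probabilities([model_a, model_b])
# Averaged class-0 probabilities are [0.55, 0.35], so labels == [[0, 1]].
print(labels)
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one at each voxel, which is one plausible way for models with different strengths to complement each other.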

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21775/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21775/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/2508.21775/full.md

---
Source: https://tomesphere.com/paper/2508.21775