Bridging the Domain Divide: Supervised vs. Zero-Shot Clinical Section Segmentation from MIMIC-III to Obstetrics
Baris Karacan, Barbara Di Eugenio, Patrick Thornton

TL;DR
This paper compares supervised and zero-shot models for clinical section segmentation, introducing a new obstetrics dataset and highlighting zero-shot models' robustness across domains.
Contribution
It introduces a new obstetrics notes dataset, evaluates transformer-based models in and out of domain, and compares supervised and zero-shot approaches for clinical segmentation.
Findings
Supervised models perform well in-domain but poorly out-of-domain.
Zero-shot models are more robust out-of-domain, provided their hallucinated output is corrected.
Developing domain-specific resources enhances segmentation performance.
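The summary does not say how hallucination correction is performed. As a rough illustration only, the sketch below assumes it means discarding section headers that a zero-shot model returns but that never occur in the note. The function name `keep_grounded_sections`, the `label`/`header` dictionary layout, and the sample note are all hypothetical and are not taken from the paper.

```python
import re


def keep_grounded_sections(note: str, predicted: list[dict]) -> list[dict]:
    """Keep only predicted sections whose header text occurs in the note.

    Assumption: each prediction is a dict with a "label" and the "header"
    string the model claims to have seen. This is an illustrative filter,
    not the paper's correction procedure.
    """
    kept = []
    for section in predicted:
        # Case-insensitive literal search for the claimed header.
        match = re.search(re.escape(section["header"]), note, flags=re.IGNORECASE)
        if match is None:
            continue  # header absent from the note: treat as hallucinated
        kept.append(
            {
                "label": section["label"],
                "header": section["header"],
                "start": match.start(),  # character offset of the header
            }
        )
    # Return sections in the order they appear in the note.
    return sorted(kept, key=lambda s: s["start"])


# Hypothetical note and model output, for illustration.
note = (
    "Chief Complaint: pain\n"
    "History of Present Illness: 30 y/o G2P1\n"
    "Assessment: stable"
)
predicted = [
    {"label": "chief_complaint", "header": "Chief Complaint:"},
    {"label": "medications", "header": "Medications:"},  # not in the note
    {"label": "assessment", "header": "assessment:"},
]
grounded = keep_grounded_sections(note, predicted)
```

Here the `medications` prediction is dropped because no such header appears in the note, while the other two are kept with their character offsets. A real pipeline might also need fuzzy matching for headers the model paraphrases, which this sketch does not attempt.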
Abstract
Clinical free-text notes contain vital patient information. They are structured into labelled sections; recognizing these sections has been shown to support clinical decision-making and downstream NLP tasks. In this paper, we advance clinical section segmentation through three key contributions. First, we curate a new de-identified, section-labeled obstetrics notes dataset, to supplement the medical domains covered in public corpora such as MIMIC-III, on which most existing segmentation approaches are trained. Second, we systematically evaluate transformer-based supervised models for section segmentation on a curated subset of MIMIC-III (in-domain), and on the new obstetrics dataset (out-of-domain). Third, we conduct the first head-to-head comparison of supervised models for medical section segmentation with zero-shot large language models. Our results show that while supervised models…
