Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models

Mengxian Lyu; Cheng Peng; Ziyi Chen; Mengyuan Zhang; Jieting Li Lu; Yonghui Wu

arXiv:2603.19275 · cs.CL · April 13, 2026

TL;DR

This study enhances radiology report summarization by introducing a mid-training step for large language models, leading to improved performance and reduced cold start issues.

Contribution

It proposes a novel subdomain adaptation via mid-training for LLMs, outperforming traditional pre-training and fine-tuning approaches in radiology report summarization.

Findings

1. The mid-trained model, GatorTronT5-Radio, outperforms models without mid-training on both ROUGE-L and RadGraph-F1.

2. Mid-training improves few-shot learning and alleviates the cold-start problem.

3. A "pre-training, mid-training, fine-tuning" strategy is more effective than direct fine-tuning.
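
The findings above report ROUGE-L as one of the two evaluation metrics. As a rough illustration of what that score measures, here is a minimal sketch of a sentence-level ROUGE-L F1 based on the longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization and a balanced F1; the evaluation tooling the authors actually used is not stated on this page, and the function names and example strings are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 (assumption: whitespace tokens, balanced F1)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Made-up impression strings, for illustration only.
reference = "no acute cardiopulmonary abnormality"
candidate = "no acute abnormality"
score = rouge_l_f1(reference, candidate)
```

Here the LCS is the three tokens "no acute abnormality", so precision is 3/3, recall is 3/4, and the F1 is 6/7. ROUGE-L only rewards token overlap in order, which is why the findings pair it with RadGraph-F1, a separate metric not sketched here.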

Abstract

Automatic summarization of radiology reports is an essential application to reduce the burden on physicians. Previous studies have widely used the "pre-training, fine-tuning" strategy to adapt large language models (LLMs) for summarization. This study proposes subdomain adaptation through mid-training to improve summarization. We explored three adaptation strategies: (1) general-domain pre-training, (2) clinical-domain pre-training, and (3) clinical-domain pre-training followed by subdomain mid-training. We developed models using large-scale clinical text from University of Florida (UF) Health and conducted mid-training and fine-tuning experiments on widely used benchmark datasets, including OpenI and MIMIC-CXR. The experimental results show that the mid-trained model, GatorTronT5-Radio, achieved the best performance, outperforming models without mid-training in both…
