Learning to Diagnose Privately: DP-Powered LLMs for Radiology Report Classification

Payel Bhattacharjee; Fengwei Tian; Geoffrey D. Rubin; Joseph Y. Lo; Nirav Merchant; Heidi Hanson; John Gounley; Ravi Tandon

arXiv:2506.04450 · cs.CR · March 31, 2026


TL;DR

This paper introduces a differentially private fine-tuning framework for large language models to classify multiple abnormalities in radiology reports, balancing privacy with high classification accuracy.

Contribution

The paper proposes DP-LoRA, a novel method that combines differential privacy (DP) with Low-Rank Adaptation (LoRA) for privacy-preserving classification of medical reports.
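To make the idea of combining differential privacy with LoRA more concrete, here is a minimal sketch of one generic DP-SGD update applied only to a flattened vector of adapter parameters, with the base model assumed frozen. This summary does not spell out the paper's exact training procedure, so the function names (`clip_gradient`, `dp_average_gradient`, `dp_lora_step`) and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Generic DP-SGD aggregation: clip per example, sum, add Gaussian noise, average."""
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / batch_size for n in noisy]


def dp_lora_step(lora_params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One private gradient step on the (hypothetical) flattened LoRA parameters.

    Only the low-rank adapter parameters are updated; base weights are
    assumed frozen and are not represented here.
    """
    grad = dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng)
    return [p - lr * g for p, g in zip(lora_params, grad)]


if __name__ == "__main__":
    rng = random.Random(0)
    params = [0.0, 0.0]
    grads = [[3.0, 4.0], [0.1, -0.2]]  # toy per-example gradients
    print(dp_lora_step(params, grads, lr=0.1, clip_norm=1.0,
                       noise_multiplier=1.0, rng=rng))
```

Because only the small adapter vector is clipped and noised, the noise is added to far fewer coordinates than in full fine-tuning, which is one common motivation for pairing DP with parameter-efficient methods. A real pipeline would also track the cumulative privacy budget with a privacy accountant, which this sketch omits.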

Findings

1. Achieves weighted F1-score up to 0.89 under moderate privacy budgets.

2. Performance approaches non-private LoRA (0.90) and full fine-tuning (0.96).

3. Demonstrates effective privacy-utility trade-offs on MIMIC-CXR and CT-RATE datasets.
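For readers unfamiliar with the metric quoted in the findings, the sketch below computes a weighted F1-score for multi-label predictions. It assumes the common convention that "weighted" means averaging per-label F1 scores by each label's support (its number of positive ground-truth instances); this summary does not confirm which convention the paper uses. The helper names and toy data are hypothetical.

```python
def f1_for_label(y_true, y_pred):
    """Binary F1 for one label, given 0/1 ground truth and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(true_rows, pred_rows):
    """Support-weighted mean of per-label F1 over multi-label indicator rows."""
    n_labels = len(true_rows[0])
    total_support = 0
    weighted_sum = 0.0
    for j in range(n_labels):
        col_true = [row[j] for row in true_rows]
        col_pred = [row[j] for row in pred_rows]
        support = sum(col_true)  # number of reports that truly have label j
        weighted_sum += support * f1_for_label(col_true, col_pred)
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


if __name__ == "__main__":
    # Toy data: 4 reports, 2 abnormality labels (columns).
    y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
    y_pred = [[1, 0], [0, 1], [0, 1], [0, 0]]
    print(weighted_f1(y_true, y_pred))
```

On this toy data the first label has F1 0.5 with support 3 and the second has F1 1.0 with support 2, so the weighted score is about 0.7. Frequent abnormalities therefore influence the score more than rare ones.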

Abstract

Large Language Models (LLMs) are increasingly adopted across domains such as education, healthcare, and finance. In healthcare, LLMs support tasks including disease diagnosis, abnormality classification, and clinical decision-making. Among these, multi-abnormality classification of radiology reports is critical for clinical workflow automation and biomedical research. Leveraging strong natural language processing capabilities, LLMs enable efficient processing of unstructured medical text and reduce the administrative burden of manual report analysis. To improve performance, LLMs are often fine-tuned on private, institution-specific datasets such as radiology reports. However, this raises significant privacy concerns: LLMs may memorize training data and become vulnerable to data extraction attacks, while sharing fine-tuned models risks exposing sensitive patient information. Despite…
