TL;DR
RadLite demonstrates that small, multi-task radiology language models can be effectively fine-tuned with LoRA and deployed on consumer CPUs, enabling accessible radiology AI applications.
Contribution
This work introduces RadLite, an approach to multi-task radiology AI that fine-tunes small language models (3 to 4 billion parameters) with LoRA so they can be deployed on consumer-grade CPUs.
Findings
LoRA fine-tuning substantially improves performance over zero-shot baselines (RADS accuracy +53%, NLI +60%).
Qwen2.5-3B excels at structured generation, Qwen3-4B at extractive tasks.
Ensembling models yields the best overall performance.
Abstract
Large language models (LLMs) show promise in radiology, but their deployment is limited by computational requirements that preclude use in resource-constrained clinical environments. We investigate whether small language models (SLMs) of 3 to 4 billion parameters can achieve strong multi-task radiology performance through LoRA fine-tuning, enabling deployment on consumer-grade CPUs. We train Qwen2.5-3B-Instruct and Qwen3-4B on 162K samples spanning 9 radiology tasks (RADS classification across 10 systems, impression generation, temporal comparison, radiology NLI, NER, abnormality detection, N/M staging, and radiology Q&A) compiled from 12 public datasets. Both models are evaluated on up to 500 held-out test samples per task with standardized metrics. Our key findings are: (1) LoRA fine-tuning dramatically improves performance over zero-shot baselines (RADS accuracy +53%, NLI +60%,…
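For readers unfamiliar with LoRA, the fine-tuning method named in the abstract, the following is a minimal pure-Python sketch of the general idea. It is not the authors' training code. A frozen weight matrix W is augmented with a trainable low-rank product B·A scaled by alpha/r. The function names, the toy dimensions, the rank, and the alpha value are all illustrative assumptions and do not come from the paper.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def lora_trainable_params(d_out, d_in, r):
    """Number of trainable parameters in the A and B matrices."""
    return r * (d_in + d_out)

# Toy sizes (hypothetical, chosen only for illustration).
d_out, d_in, rank, alpha = 4, 6, 2, 16

random.seed(0)
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
# B starts at zero in the standard LoRA initialization, so the adapted
# model is identical to the base model before any training step.
B = [[0.0 for _ in range(rank)] for _ in range(d_out)]

W_eff = lora_effective_weight(W, A, B, alpha)
full_params = d_out * d_in
lora_params = lora_trainable_params(d_out, d_in, rank)
```

Only A and B are updated during training, so the trainable parameter count grows with r·(d_in + d_out) rather than d_in·d_out. For the large projection matrices in a multi-billion-parameter model this makes the trainable fraction very small, and the low-rank update can be merged into W afterwards, leaving inference cost unchanged.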
