Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks
Niall Taylor, Upamanyu Ghose, Omid Rohanian, Mohammadmahdi Nouriborji, Andrey Kormilitzin, David Clifton, Alejo Nevado-Holgado

TL;DR
This paper evaluates the effectiveness of parameter-efficient fine-tuning methods, especially LoRA, on small and large language models for clinical decision tasks, highlighting efficiency and performance trade-offs.
Contribution
It systematically compares PEFT methods across model sizes in clinical tasks, emphasizing LoRA's robustness and the benefits of small models with domain-specific pre-training.
Findings
LoRA maintains high performance across tasks and sizes
Small models with PEFT can match large model performance
PEFT offers cost-effective solutions for clinical NLP
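To make the cost argument concrete, here is a minimal sketch of the standard LoRA formulation: a frozen weight matrix W receives a trainable low-rank update, giving W + (alpha / r) * B A. This is not the authors' code. The layer sizes, rank, and function names below are illustrative assumptions, chosen only to show how few parameters the update trains compared with full fine-tuning of the same matrix.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_in * d_out            # full fine-tuning updates every entry of W
    lora = rank * (d_in + d_out)   # LoRA trains only A (rank x d_in) and B (d_out x rank)
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Compute W + (alpha / r) * B @ A, the merged weight after LoRA training."""
    rank = len(a)                  # A has shape (rank, d_in)
    scale = alpha / rank
    delta = matmul(b, a)           # shape (d_out, d_in), same as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Illustrative sizes only (assumed, not taken from the paper).
full, lora = lora_param_counts(d_in=768, d_out=768, rank=8)
print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank 8):    {lora} parameters ({lora / full:.2%} of full)")
```

In the usual setup B starts at zero, so the adapted layer initially behaves exactly like the frozen pre-trained one, and only A and B are updated during training.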
Abstract
The entry of large language models (LLMs) into research and commercial spaces has led to a trend of ever-larger models, with initial promises of generalisability, followed by a widespread desire to downsize and create specialised models without the need for complete fine-tuning, using Parameter Efficient Fine-tuning (PEFT) methods. We present an investigation into the suitability of different PEFT methods to clinical decision-making tasks, across a range of model sizes, including extremely small models with as few as 25 million parameters. Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another, with the exception of LoRA, which maintains relatively high performance across all model sizes and tasks, typically approaching or matching full fine-tuned performance. The effectiveness of PEFT methods in the clinical domain is evident,…
Taxonomy
Topics: Topic Modeling
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
