Fixed-Budget Parameter-Efficient Training with Frozen Encoders Improves Multimodal Chest X-Ray Classification
Md Ashik Khan, Md Nahid Siddique

TL;DR
Under a fixed budget of 2.37M trainable parameters, parameter-efficient training with frozen encoders reaches higher AUROC on multimodal chest X-ray classification than full fine-tuning, while training about 40x fewer parameters.
Contribution
The paper evaluates parameter-efficient training strategies (frozen encoders, BitFit, LoRA, and adapters) for multimodal chest X-ray classification and shows that they outperform full fine-tuning under a fixed parameter budget.
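To make the fixed budget concrete, the sketch below recomputes the ratios from the parameter counts quoted in the abstract. It assumes the model's total parameter count is approximately the 94.3M trained under full fine-tuning; the abstract does not state the total separately, and the function names are illustrative only.

```python
# Parameter counts as quoted in the abstract.
PET_TRAINABLE = 2.37e6    # trainable parameters under the fixed PET budget
FULL_TRAINABLE = 94.3e6   # trainable parameters under full fine-tuning


def trainable_fraction(trainable: float, total: float) -> float:
    """Share of the model's parameters that receive gradient updates."""
    return trainable / total


def reduction_factor(full: float, reduced: float) -> float:
    """How many times fewer parameters are trained."""
    return full / reduced


# Assumption: total parameters ~= full fine-tuning's trainable count.
fraction = trainable_fraction(PET_TRAINABLE, FULL_TRAINABLE)
reduction = reduction_factor(FULL_TRAINABLE, PET_TRAINABLE)

print(f"trainable share: {fraction:.2%}")
print(f"reduction: {reduction:.1f}x")
```

Under that assumption the trainable share comes out at about 2.5% and the reduction at roughly 40x, consistent with the reported figures.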
Findings
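PET methods outperform full fine-tuning in AUROC (0.892 to 0.908 versus 0.770) on the Indiana University Chest X-Ray dataset.
All PET variants achieve >0.89 AUROC with 2.37M trainable parameters (2.51% of the total).
External validation on CheXpert gives >0.69 AUROC for all PET methods with <9% trainable parameters; Adapter performs best (0.7214 AUROC).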
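Since every finding is stated as an AUROC for a multi-label task, a minimal sketch of how such a score can be computed may help. It assumes macro averaging (the unweighted mean of per-label AUROCs); this page does not say which averaging the paper uses, and the toy labels and scores are made up for illustration.

```python
from typing import List, Optional, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """AUROC as the probability that a random positive outranks a random negative.

    Ties count as half a win. Returns None when one class is absent,
    because AUROC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of per-label AUROCs, skipping labels with a single class."""
    n_labels = len(y_true[0])
    per_label = []
    for j in range(n_labels):
        auc = binary_auroc([row[j] for row in y_true], [row[j] for row in y_score])
        if auc is not None:
            per_label.append(auc)
    if not per_label:
        raise ValueError("no label has both positive and negative samples")
    return sum(per_label) / len(per_label)


# Toy data (illustrative only): 4 samples, 2 labels.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.4, 0.5]]
toy_macro = macro_auroc(y_true, y_score)
print(toy_macro)
```

The pairwise formulation is quadratic in the number of samples and is meant only to show the definition; in practice a library routine such as scikit-learn's `roc_auc_score` with `average="macro"` would normally be used.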
Abstract
Multimodal chest X-ray analysis often fine-tunes large vision-language models, which is computationally costly. We study parameter-efficient training (PET) strategies, including frozen encoders, BitFit, LoRA, and adapters, for multi-label classification on the Indiana University Chest X-Ray dataset (3,851 image-report pairs; 579 test samples). To mitigate data leakage, we redact pathology terms from reports used as text inputs while retaining clinical context. Under a fixed parameter budget (2.37M parameters, 2.51% of total), all PET variants achieve AUROC between 0.892 and 0.908, outperforming full fine-tuning (0.770 AUROC), which uses 94.3M trainable parameters; the PET budget is thus a 40x reduction in trainable parameters. External validation on CheXpert (224,316 images, 58x larger) confirms scalability: all PET methods achieve >0.69 AUROC with <9% trainable parameters, with Adapter achieving best performance (0.7214 AUROC).…
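The redaction step described in the abstract could look roughly like the sketch below. The term list, the mask token, and the helper names are hypothetical; the abstract does not say which terms the authors redact or how they mask them.

```python
import re
from typing import Iterable

# Hypothetical term list for illustration; the paper's actual list is not given here.
PATHOLOGY_TERMS = ["cardiomegaly", "pneumonia", "pleural effusion", "atelectasis"]


def build_redactor(terms: Iterable[str]) -> re.Pattern:
    """Compile one case-insensitive pattern matching any term as a whole word."""
    # Longest terms first so multi-word terms are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    return re.compile(rf"\b(?:{alternation})\b", flags=re.IGNORECASE)


def redact(report: str, pattern: re.Pattern, mask: str = "[REDACTED]") -> str:
    """Replace every matched pathology term with the mask token."""
    return pattern.sub(mask, report)


pattern = build_redactor(PATHOLOGY_TERMS)
report = "Mild cardiomegaly. No pleural effusion. Patient has chronic cough."
redacted = redact(report, pattern)
print(redacted)
```

Only the listed terms are masked, so surrounding clinical context such as the chronic cough remains available to the text encoder while the label words themselves do not.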
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Domain Adaptation and Few-Shot Learning
