A Comprehensive Study on Fine-Tuning Large Language Models for Medical Question Answering Using Classification Models and Comparative Analysis

Aysegul Ucar; Soumik Nayak; Anunak Roy; Burak Taşcı; Gülay Taşcı

arXiv:2501.17190·cs.CL·January 30, 2025

TL;DR

This study evaluates language models such as RoBERTa and BERT for medical question answering on a dataset scraped from Healthline. The models classify each question and return a reliable answer with high accuracy and efficiency.

Contribution

The paper introduces a two-stage approach to medical QA, in which a classification model predicts a label for the question and a predefined answer is returned for that label, and it provides a comparative analysis of how different LLMs perform on this task.
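The two-stage idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the labels, canned answers, training examples, and the word-overlap classifier are all hypothetical stand-ins, whereas the paper uses fine-tuned models such as BERT and RoBERTa for the classification stage.

```python
from collections import Counter

# Hypothetical labels and canned answers; the paper's real label set is not shown here.
PREDEFINED_ANSWERS = {
    "headache": "Rest, hydrate, and see a clinician if the pain persists.",
    "fever": "Monitor your temperature and seek care if it stays high.",
}

# Toy training examples (illustrative only, not taken from the Healthline data).
TRAINING_DATA = [
    ("my head hurts and throbs", "headache"),
    ("what causes a migraine headache", "headache"),
    ("i have a high temperature", "fever"),
    ("how to bring down a fever", "fever"),
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def build_label_vocab(examples):
    """Count how often each word appears under each label."""
    vocab = {}
    for text, label in examples:
        vocab.setdefault(label, Counter()).update(tokenize(text))
    return vocab


def predict_label(question, vocab):
    """Stage 1: pick the label whose training words overlap most with the question.

    A stand-in for the fine-tuned classifier described in the paper.
    """
    tokens = tokenize(question)
    scores = {label: sum(counts[t] for t in tokens) for label, counts in vocab.items()}
    return max(scores, key=scores.get)


def answer_question(question, vocab, answers):
    """Stage 2: return the predefined answer stored for the predicted label."""
    label = predict_label(question, vocab)
    return label, answers[label]


vocab = build_label_vocab(TRAINING_DATA)
label, answer = answer_question("why does my head hurt", vocab, PREDEFINED_ANSWERS)
print(label, "->", answer)
```

Because the answer is looked up rather than generated, the quality of the response depends entirely on the classifier picking the right label, which is why the paper's comparison focuses on classification metrics.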

Findings

01

BERT Large Uncased achieved 100% accuracy and F1 score.

02

RoBERTa-base demonstrated near-perfect performance with over 99% accuracy.

03

LoRA RoBERTa-large showed promising results with 78.47% accuracy.

Abstract

This paper presents an overview of the development and fine-tuning of large language models (LLMs) designed specifically for answering medical questions. Our main goal is to improve the accuracy and efficiency of providing reliable answers to medical queries. Our approach has two stages: predicting a specific label for the received medical question, then returning a predefined answer for that label. Various models, such as RoBERTa and BERT, were examined and evaluated on this task. The models were trained on datasets derived from 6,800 samples scraped from Healthline.com, supplemented with synthetic data. For evaluation, we conducted a comparative study using 5-fold cross-validation. To assess performance, we used accuracy, precision, recall, and F1 score, and also recorded the training time. The performance of the models was evaluated…
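The evaluation protocol named in the abstract (5-fold cross-validation scored with accuracy, precision, recall, and F1, plus training time) could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the toy dataset, the keyword classifier standing in for a fine-tuned model, and the choice of macro-averaging, which the abstract does not specify.

```python
import random
import time
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def train_keyword_model(texts, labels):
    """Hypothetical stand-in for fine-tuning: count words seen under each label."""
    model = {}
    for text, label in zip(texts, labels):
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def predict_keyword_model(model, text):
    """Predict the label whose word counts overlap most with the text."""
    tokens = text.lower().split()
    return max(model, key=lambda label: sum(model[label][t] for t in tokens))


def cross_validate(texts, labels, k=5):
    """Train on k-1 folds, score on the held-out fold, and average over folds."""
    fold_scores = []
    for held_out in k_fold_indices(len(texts), k):
        held = set(held_out)
        train_idx = [i for i in range(len(texts)) if i not in held]
        start = time.perf_counter()
        model = train_keyword_model([texts[i] for i in train_idx],
                                    [labels[i] for i in train_idx])
        train_seconds = time.perf_counter() - start
        y_true = [labels[i] for i in held_out]
        y_pred = [predict_keyword_model(model, texts[i]) for i in held_out]
        fold = macro_metrics(y_true, y_pred)
        fold["train_seconds"] = train_seconds
        fold_scores.append(fold)
    return {name: sum(s[name] for s in fold_scores) / k for name in fold_scores[0]}


# Toy, trivially separable data (illustrative only).
texts = [f"high temperature fever {i}" for i in range(10)] + \
        [f"head pain migraine {i}" for i in range(10)]
labels = ["fever"] * 10 + ["headache"] * 10
scores = cross_validate(texts, labels, k=5)
print({name: round(value, 4) for name, value in scores.items()})
```

Averaging over folds gives each model a single comparable score per metric, and timing only the training call keeps the cost comparison separate from prediction.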

Taxonomy

Topics: Topic Modeling

Methods: Attention Is All You Need · Layer Normalization · Softmax · Linear Warmup With Linear Decay · Adam · Residual Connection · Dropout · Linear Layer · Dense Connections