Efficient Medical Question Answering with Knowledge-Augmented Question Generation
Julien Khlaut, Corentin Dancette, Elodie Ferreres, Alaedine Bennani, Paul Hérent, Pierre Manceron

TL;DR
This paper presents a method to improve the medical question answering ability of small language models by fine-tuning them first on medical textbooks and then on GPT-4-generated questions, and introduces a new dataset, ECN-QA, to evaluate it.
Contribution
It introduces a novel training strategy using textbook fine-tuning and GPT-4 question generation, along with the ECN-QA dataset for medical QA evaluation.
Findings
Small models improve significantly with the proposed training method.
ECN-QA dataset enables evaluation of progressive medical questions.
Fine-tuning on textbooks and generated questions enhances medical QA performance.
Abstract
In the expanding field of language model applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. Large language models, such as GPT-4, obtain reasonable scores on medical question answering tasks, but smaller models are far behind. In this work, we introduce a method to improve the proficiency of a small language model in the medical domain by employing a two-fold approach. We first fine-tune the model on a corpus of medical textbooks. Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model. Additionally, we introduce ECN-QA, a novel medical question answering dataset containing "progressive questions" composed of related sequential questions. We show the benefits of our training strategy on this dataset. The study's findings…
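The two-stage recipe and the "progressive question" format described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not the authors' code: the `ProgressiveQuestion` schema, the prompt wording in `build_generation_prompt`, and the stage labels in `training_stages` are all hypothetical. The sketch only builds the prompt text and orders the training data; it does not call GPT-4 or train a model.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class ProgressiveQuestion:
    """Hypothetical schema: a shared case introduction plus related sequential questions."""
    introduction: str
    questions: List[str] = field(default_factory=list)


def build_generation_prompt(textbook_passage: str, n_questions: int = 3) -> str:
    """Assemble a prompt asking a large model for questions grounded in a textbook excerpt.

    The wording is an assumption; the paper's actual prompt is not given in the abstract.
    """
    return (
        "You are writing medical exam questions.\n"
        f"Using only the textbook excerpt below, write {n_questions} related "
        "questions that build on one another.\n\n"
        f"Excerpt:\n{textbook_passage.strip()}\n"
    )


def training_stages(
    textbook_corpus: Iterable[str],
    generated_questions: Iterable[str],
) -> List[Tuple[str, List[str]]]:
    """Return the two fine-tuning stages in the order the abstract describes them."""
    return [
        ("textbook_finetuning", list(textbook_corpus)),
        ("generated_question_finetuning", list(generated_questions)),
    ]


if __name__ == "__main__":
    prompt = build_generation_prompt("Placeholder textbook excerpt.", n_questions=2)
    print(prompt)
    for name, data in training_stages(["chapter text"], ["generated question"]):
        print(name, len(data))
```

Keeping prompt construction separate from the model call makes it straightforward to swap in whichever question generator and fine-tuning framework are actually used.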
Taxonomy
Topics: Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Label Smoothing · Adam · Absolute Position Encodings · Dropout
