SA-MDKIF: A Scalable and Adaptable Medical Domain Knowledge Injection Framework for Large Language Models

Tianhan Xu; Zhe Hu; Ling Chen; Bin Li

arXiv:2402.00474 · cs.CL · February 2, 2024 · 2 cites

TL;DR

SA-MDKIF is a framework that injects medical knowledge into large language models through instruction tuning, significantly enhancing their performance on various medical tasks, especially unseen ones.

Contribution

The paper introduces a scalable, adaptable method for injecting medical knowledge into LLMs: basic medical skills are trained separately, and a learned skill router then combines them for each downstream task, improving performance on diverse medical tasks.

Findings

01. Performance improved by 10-20% on 9 medical tasks.

02. Notable 30% improvement on unseen medical tasks.

03. Framework effectively enhances LLMs' medical domain capabilities.

Abstract

Recent advances in large language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks. However, their effective application in the medical domain is hampered by a lack of medical domain knowledge. In this study, we present SA-MDKIF, a scalable and adaptable framework that aims to inject medical knowledge into general-purpose LLMs through instruction tuning, thereby enabling adaptability for various downstream tasks. SA-MDKIF consists of two stages: skill training and skill adaptation. In the first stage, we define 12 basic medical skills and use AdaLoRA to train these skills based on uniformly formatted instructional datasets that we have constructed. In the next stage, we train the skill router using task-specific downstream data and use this router to integrate the acquired skills with LLMs during inference. Experimental results…
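
The two stages described in the abstract can be illustrated with a small sketch. It assumes each skill is stored as a plain low-rank update to a frozen weight matrix and that the router is a softmax over per-skill scores; the truncated abstract specifies neither detail, and AdaLoRA itself uses an SVD-style parameterization with an adaptive rank budget rather than the fixed-rank update shown here. All names (`SkillAdapter`, `routed_forward`, `router_logits`) are hypothetical and the weights are random placeholders, not trained values.

```python
import math
import random

NUM_SKILLS = 12  # the abstract defines 12 basic medical skills


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Placeholder weights; a real system would load trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SkillAdapter:
    """Hypothetical stand-in for one trained skill: delta(x) = B (A x)."""

    def __init__(self, d_in, d_out, rank, rng):
        self.A = random_matrix(rank, d_in, rng)   # down-projection, rank x d_in
        self.B = random_matrix(d_out, rank, rng)  # up-projection, d_out x rank

    def delta(self, x):
        return matvec(self.B, matvec(self.A, x))


def routed_forward(W, adapters, router_logits, x):
    """Frozen base projection plus a router-weighted sum of skill updates.

    Assumption: the router yields one score per skill and the scores are
    normalized with softmax before mixing.
    """
    weights = softmax(router_logits)
    out = matvec(W, x)
    for w, adapter in zip(weights, adapters):
        update = adapter.delta(x)
        out = [o + w * u for o, u in zip(out, update)]
    return out, weights


rng = random.Random(0)
d_in, d_out, rank = 8, 8, 2
W = random_matrix(d_out, d_in, rng)  # frozen base-model weight
adapters = [SkillAdapter(d_in, d_out, rank, rng) for _ in range(NUM_SKILLS)]

# Stage 2 would learn these scores from task-specific data; here skill 3
# is simply given a higher score for illustration.
router_logits = [0.0] * NUM_SKILLS
router_logits[3] = 2.0

x = [rng.uniform(-1.0, 1.0) for _ in range(d_in)]
y, weights = routed_forward(W, adapters, router_logits, x)
```

In this sketch the base weight `W` is never modified, so adding a new skill only means appending another adapter and one more router score, which is one way to read the "scalable" claim in the title.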

Taxonomy

Topics: Topic Modeling