MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation

Yusheng Liao; Shuyang Jiang; Zhe Chen; Yanfeng Wang; Yu Wang

arXiv:2406.17484·cs.CL·October 18, 2024

TL;DR

MedCare introduces a two-stage fine-tuning approach for medical large language models, effectively decoupling clinical alignment from knowledge aggregation to improve performance across diverse medical tasks.

Contribution

The paper presents a two-stage training pipeline that handles knowledge encoding and clinical alignment separately, improving generalization and achieving state-of-the-art results among medical LLMs.

Findings

01 Achieves SOTA on over 20 medical tasks

02 Improves performance across various model sizes

03 Effectively mitigates knowledge forgetting

Abstract

Large language models (LLMs) have shown substantial progress in natural language understanding and generation, proving especially valuable in the medical field. Despite these advances, challenges persist due to the complexity and diversity inherent in medical tasks, which can be categorized as knowledge-intensive tasks and alignment-required tasks. Previous approaches either ignore the latter category or focus on a minority of tasks and hence lose generalization. To address these drawbacks, we propose a progressive fine-tuning pipeline. In the first stage, this pipeline employs a Knowledge Aggregator and a Noise Aggregator to encode diverse knowledge and filter out detrimental information. In the second stage, we drop the Noise Aggregator to avoid the interference of suboptimal representation and leverage an additional alignment module optimized towards an orthogonal direction to the knowledge…
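The abstract is cut off before it says how the alignment module is kept orthogonal to the knowledge, so the following is only a generic sketch of what an orthogonality constraint can look like, not the authors' implementation. It assumes that the knowledge learned in the first stage can be summarized by a few direction vectors and that the alignment module contributes one update vector. The names orthonormal_basis, orthogonality_penalty, and project_out are hypothetical and do not come from the paper.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors: List[Vector], eps: float = 1e-12) -> List[Vector]:
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors already inside the span
            basis.append([wi / norm for wi in w])
    return basis


def orthogonality_penalty(knowledge_dirs: List[Vector], alignment_dir: Vector) -> float:
    """Squared length of the alignment vector's component inside the
    knowledge subspace. Zero exactly when the two are orthogonal.
    (Hypothetical regularizer, assumed for illustration.)"""
    return sum(dot(alignment_dir, b) ** 2 for b in orthonormal_basis(knowledge_dirs))


def project_out(knowledge_dirs: List[Vector], alignment_dir: Vector) -> Vector:
    """Remove the part of the alignment vector that lies in the knowledge
    subspace, leaving only the orthogonal component."""
    w = list(alignment_dir)
    for b in orthonormal_basis(knowledge_dirs):
        coeff = dot(w, b)
        w = [wi - coeff * bi for wi, bi in zip(w, b)]
    return w


# Toy example: knowledge occupies the first two axes of a 3-d space.
knowledge = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
alignment = [0.5, 0.2, 1.0]

penalty_before = orthogonality_penalty(knowledge, alignment)
cleaned = project_out(knowledge, alignment)
penalty_after = orthogonality_penalty(knowledge, cleaned)
```

In this toy setup the penalty is positive for the raw alignment vector and drops to zero once the overlapping component is projected out. A training loop could either add such a penalty to the loss or apply the projection to each update; which of these (if either) the paper uses is not stated in the visible part of the abstract.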

Taxonomy

Topics: Biomedical Text Mining and Ontologies
