A Novel Hierarchical Integration Method for Efficient Model Merging in Medical LLMs
Prakrit Timilsina, Anuj Nepal, Rajan Kadel, Robin Doss

TL;DR
This paper introduces a hierarchical model merging method for medical LLMs that enhances efficiency and performance in distributed healthcare settings, demonstrating that simple averaging can be highly effective.
Contribution
The paper proposes a novel hierarchical merging technique that combines selective Optimal Transport (OT) alignment of attention layers with cosine similarity-weighted interpolation, improving model integration for medical LLMs while reducing computational cost.
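To make the cosine similarity-weighted interpolation step concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the weighting rule `t = 0.5 * max(sim, 0)`, the function names, and the representation of a layer as a flat list of floats are all assumptions chosen for illustration, and the OT alignment step is omitted entirely.

```python
import math


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two flattened weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + eps)


def cosine_weighted_merge(layer_a, layer_b):
    """Interpolate two layers with a similarity-dependent coefficient.

    ASSUMPTION (hypothetical rule, not taken from the paper):
    the weight given to model B is t = 0.5 * max(sim, 0), so layers that
    point in the same direction are averaged equally, while orthogonal or
    opposed layers fall back to model A.
    """
    sim = cosine_similarity(layer_a, layer_b)
    t = 0.5 * max(sim, 0.0)
    merged = [(1.0 - t) * x + t * y for x, y in zip(layer_a, layer_b)]
    return merged, t


if __name__ == "__main__":
    # Parallel layers: equal-weight average.
    print(cosine_weighted_merge([2.0, 0.0], [4.0, 0.0]))
    # Orthogonal layers: model A is kept unchanged.
    print(cosine_weighted_merge([1.0, 0.0], [0.0, 1.0]))
```

A per-layer coefficient like this lets similar layers be blended while dissimilar ones are left alone. The rule the paper actually uses may differ.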
Findings
Hierarchical merging improves accuracy on medical benchmarks.
Task Arithmetic outperforms complex pruning methods.
Simple averaging is effective for architecturally compatible models.
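For readers unfamiliar with the two simplest baselines named above, the following sketch shows Linear Averaging and Task Arithmetic on toy parameter dictionaries. It assumes each model is a dict mapping a parameter name to a flat list of floats, that all models share the same architecture and base checkpoint, and that the `scale` value is a free hyperparameter. The scale used in the paper is not stated here, and the function names are illustrative.

```python
def linear_average(state_a, state_b):
    """Element-wise mean of two architecturally compatible models."""
    return {
        name: [(x + y) / 2.0 for x, y in zip(state_a[name], state_b[name])]
        for name in state_a
    }


def task_arithmetic(base, finetuned_models, scale=1.0):
    """Add the scaled sum of task vectors (finetuned - base) to the base."""
    merged = {}
    for name, base_w in base.items():
        delta = [0.0] * len(base_w)
        for model in finetuned_models:
            for i, (w, b) in enumerate(zip(model[name], base_w)):
                delta[i] += w - b  # accumulate this model's task vector
        merged[name] = [b + scale * d for b, d in zip(base_w, delta)]
    return merged


if __name__ == "__main__":
    # Toy stand-ins for a shared base model and two fine-tuned variants.
    base = {"w": [1.0, 1.0]}
    model_a = {"w": [2.0, 1.0]}
    model_b = {"w": [1.0, 3.0]}
    print(linear_average(model_a, model_b))
    print(task_arithmetic(base, [model_a, model_b], scale=1.0))
```

With exactly two fine-tuned models and `scale=0.5`, Task Arithmetic reduces algebraically to Linear Averaging, which is one reason the two baselines can behave similarly on models derived from the same base.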
Abstract
Large Language Models (LLMs) face significant challenges in distributed healthcare, including consolidating specialized domain knowledge across institutions while maintaining privacy, reducing computational overhead, and preventing catastrophic forgetting during model updates. This paper presents a systematic evaluation of six parameter-space merging techniques applied to two architecturally compatible medical LLMs derived from the Mistral-7B base model. We introduce a novel hierarchical method that combines selective Optimal Transport (OT) alignment for attention layers with cosine similarity-weighted interpolation, designed to address permutation variance while minimizing computational overhead for edge deployment scenarios. Our study evaluates Task Arithmetic, Linear Averaging, DARE-TIES, DELLA, Breadcrumbs, and our Hierarchical approach across five medical benchmarks. Results…
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
