AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging
Yiran Zhao, Wenxuan Zhang, Huiming Wang, Kenji Kawaguchi, Lidong Bing

TL;DR
AdaMergeX introduces an adaptive adapter merging technique for cross-lingual transfer with large language models, effectively addressing language and task divergence to improve transfer performance across languages.
Contribution
The paper proposes AdaMergeX, a novel method that uses adaptive adapter merging based on a reference task to enhance cross-lingual transfer in large language models.
Findings
Outperforms existing cross-lingual transfer methods across various settings.
Effectively separates task ability from language ability in transfer.
Demonstrates robustness across multiple languages and tasks.
Abstract
As an effective alternative to direct fine-tuning on target tasks in specific languages, cross-lingual transfer addresses the challenge of limited training data by decoupling "task ability" and "language ability": it fine-tunes on the target task in the source language and on another selected task in the target language, respectively. However, existing methods fail to fully separate the task ability from the source language or the language ability from the chosen task. In this paper, we acknowledge the mutual reliance between task ability and language ability and direct our attention toward the gap between the target language and the source language on tasks. As the gap removes the impact of tasks, we assume that it remains consistent across tasks. Based on this assumption, we propose a new cross-lingual transfer method called AdaMergeX that utilizes adaptive adapter merging. By…
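The abstract is cut off before the merging step is described, so the following is only an illustrative sketch of the stated assumption, not the authors' implementation. It assumes adapters combine additively, so that the language gap measured on a reference task (target-language adapter minus source-language adapter) can be added to an adapter trained on the target task in the source language. The `Adapter` type, the `merge_additive` function, the `scale` parameter, and the parameter name `layer0.lora` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened adapter weights.
Adapter = Dict[str, List[float]]


def merge_additive(
    task_src: Adapter,   # adapter for the target task in the source language
    ref_tgt: Adapter,    # adapter for the reference task in the target language
    ref_src: Adapter,    # adapter for the reference task in the source language
    scale: float = 1.0,  # hypothetical weight applied to the language gap
) -> Adapter:
    """Estimate a target-task, target-language adapter.

    Assumes the language gap (ref_tgt - ref_src) is the same for every task,
    and that adapters can be combined by element-wise addition.
    """
    merged: Adapter = {}
    for name, weights in task_src.items():
        # Language gap measured on the reference task.
        gap = [t - s for t, s in zip(ref_tgt[name], ref_src[name])]
        # Shift the source-language task adapter by that gap.
        merged[name] = [w + scale * g for w, g in zip(weights, gap)]
    return merged


# Toy example with made-up numbers.
task_src = {"layer0.lora": [0.5, -1.0, 2.0]}
ref_src = {"layer0.lora": [1.0, 1.0, 1.0]}
ref_tgt = {"layer0.lora": [1.5, 0.0, 1.0]}

merged = merge_additive(task_src, ref_tgt, ref_src)
print(merged["layer0.lora"])  # [1.0, -2.0, 2.0]
```

With `scale=0.0` the function returns the source-language task adapter unchanged, which makes it easy to compare the merged adapter against a no-transfer baseline. Adapter types that do not combine additively would need a different merging rule.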
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Adapter
