AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging

Yiran Zhao; Wenxuan Zhang; Huiming Wang; Kenji Kawaguchi; Lidong Bing

arXiv:2402.18913·cs.CL·March 1, 2024·2 cites

TL;DR

AdaMergeX introduces an adaptive adapter merging technique for cross-lingual transfer with large language models, effectively addressing language and task divergence to improve transfer performance across languages.

Contribution

The paper proposes AdaMergeX, a novel method that uses adaptive adapter merging based on a reference task to enhance cross-lingual transfer in large language models.

Findings

01. Outperforms existing cross-lingual transfer methods across various settings.

02. Effectively separates task ability from language ability in transfer.

03. Demonstrates robustness across multiple languages and tasks.

Abstract

As an effective alternative to direct fine-tuning on target tasks in specific languages, cross-lingual transfer addresses the challenge of limited training data by decoupling "task ability" and "language ability": it fine-tunes on the target task in the source language and on another selected task in the target language, respectively. However, existing methods fail to fully separate the task ability from the source language or the language ability from the chosen task. In this paper, we acknowledge the mutual reliance between task ability and language ability and direct our attention toward the gap between the target language and the source language on tasks. As the gap removes the impact of tasks, we assume that it remains consistent across tasks. Based on this assumption, we propose a new cross-lingual transfer method called AdaMergeX that utilizes adaptive adapter merging. By…
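
The consistency assumption in the abstract can be illustrated with a small sketch. It is not taken from the paper or its repository: it assumes additive adapters (for example, LoRA-style weight deltas), models each adapter as a plain dict of float lists, and uses hypothetical names throughout (`merge_adapters`, `task_src`, `ref_src`, `ref_tgt`, `scale`). Under those assumptions, the language gap is measured on a reference task and then added to the adapter trained on the target task in the source language.

```python
# Illustrative sketch only. Adapters are modeled as dicts mapping a
# parameter name to a flat list of floats (a hypothetical representation,
# not the format used by any particular library).

def merge_adapters(task_src, ref_tgt, ref_src, scale=1.0):
    """Estimate an adapter for the target task in the target language.

    Assumes additive adapters, so the language gap is the element-wise
    difference between the reference-task adapters in the two languages.
    `scale` is a hypothetical knob for how much of the gap to apply.
    """
    merged = {}
    for name, weights in task_src.items():
        # Language gap measured on the reference task.
        gap = [t - s for t, s in zip(ref_tgt[name], ref_src[name])]
        # Apply the same gap to the target-task adapter.
        merged[name] = [w + scale * g for w, g in zip(weights, gap)]
    return merged


# Toy values, not real adapter weights.
task_src = {"layer0": [0.5, -1.0]}  # target task, source language
ref_src = {"layer0": [1.0, 2.0]}    # reference task, source language
ref_tgt = {"layer0": [1.5, 1.0]}    # reference task, target language

merged = merge_adapters(task_src, ref_tgt, ref_src)
print(merged)
```

With `scale=0.0` the function returns the source-language task adapter unchanged, which makes the role of the gap term easy to check in isolation. Adapter types that are not additive would need a different merging rule.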

Code & Models

Repositories

damo-nlp-sg/adamergex (PyTorch, official)

Videos

AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Adapter