On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based Multilingual Model

Nohil Park; Joonsuk Park; Kang Min Yoo; Sungroh Yoon

arXiv:2311.07820 · cs.CL · November 15, 2023 · 1 cites

Open Access

TL;DR

This paper compares prompt tuning and fine-tuning on multilingual autoregressive models, finding prompt tuning often outperforms fine-tuning, especially for low-resource languages, with minimal parameter updates.

Contribution

It provides the first comprehensive comparison of prompt tuning versus fine-tuning on decoder-based multilingual models across multiple cross-lingual tasks.

Findings

01

Prompt tuning matches or exceeds fine-tuning performance.

02

Prompt tuning updates only 0.13% of model parameters.

03

Prompt tuning is more effective for low-resource languages.
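
To make the parameter-efficiency finding concrete, here is a minimal sketch of how the trainable fraction can be estimated when only a soft prompt is tuned and the base model stays frozen. The function name and all numeric values are hypothetical; this page does not state the prompt length, hidden size, or model size behind the 0.13% figure, so the sketch does not attempt to reproduce it.

```python
def prompt_tuning_fraction(num_prompt_tokens: int,
                           hidden_size: int,
                           base_params: int) -> float:
    """Fraction of all parameters that are trainable under plain prompt tuning.

    Assumes the only trainable parameters are the soft prompt vectors
    (one vector of size `hidden_size` per prompt token) and that every
    parameter of the base model is frozen.
    """
    prompt_params = num_prompt_tokens * hidden_size
    return prompt_params / (base_params + prompt_params)


# Hypothetical configuration, chosen for illustration only.
fraction = prompt_tuning_fraction(num_prompt_tokens=20,
                                  hidden_size=1024,
                                  base_params=500_000_000)
print(f"trainable share: {fraction:.6%}")
```

Variants that add a reparameterization network or per-layer prompts would have more trainable parameters than this simple count suggests.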

Abstract

An exciting advancement in the field of multilingual models is the emergence of autoregressive models with zero- and few-shot capabilities, a phenomenon widely reported in large-scale language models. To further improve model adaptation to cross-lingual tasks, another trend is to fine-tune the language models with either full fine-tuning or parameter-efficient tuning. However, the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models has yet to be studied. Specifically, we lack an understanding of how linguistic distributions in multilingual models affect the effectiveness of token-based prompt tuning. To address this question, we conduct experiments comparing prompt tuning and fine-tuning on the decoder-based multilingual model, XGLM, with four cross-lingual tasks (XNLI, PAWS-X, POS, NER). According to our…
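
As a rough illustration of what token-based prompt tuning means in general, the sketch below prepends trainable prompt vectors to frozen token embeddings and updates only the prompt. It uses plain Python lists so it runs without dependencies; the function names and toy values are hypothetical, and it is not taken from the paper's implementation or from XGLM.

```python
from typing import List

Vector = List[float]


def prepend_soft_prompt(soft_prompt: List[Vector],
                        token_ids: List[int],
                        embedding_table: List[Vector]) -> List[Vector]:
    """Build the model input: trainable prompt vectors, then frozen token embeddings."""
    token_embeddings = [embedding_table[i] for i in token_ids]
    return soft_prompt + token_embeddings


def sgd_step_on_prompt(soft_prompt: List[Vector],
                       prompt_grads: List[Vector],
                       lr: float) -> List[Vector]:
    """Apply one gradient step to the soft prompt only.

    The embedding table and the rest of the model are never touched,
    which is what distinguishes prompt tuning from full fine-tuning.
    """
    return [[w - lr * g for w, g in zip(vec, grad)]
            for vec, grad in zip(soft_prompt, prompt_grads)]


# Toy values: a 3-word vocabulary with 2-dimensional embeddings.
embedding_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
soft_prompt = [[0.5, 0.5]]  # one trainable prompt vector

inputs = prepend_soft_prompt(soft_prompt, [2, 1], embedding_table)
updated_prompt = sgd_step_on_prompt(soft_prompt, [[1.0, -1.0]], lr=0.5)
```

In a real setup the gradients would come from backpropagating the task loss through the frozen model; here they are supplied by hand to keep the sketch self-contained.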

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis