Teaching Large Language Models an Unseen Language on the Fly

Chen Zhang; Xiao Liu; Jiuheng Lin; Yansong Feng

arXiv:2402.19167 · cs.CL · June 14, 2024 · 1 citation

TL;DR

This paper investigates whether large language models can learn an entirely new language on the fly through prompting alone, using only a dictionary and a small set of parallel sentences, in order to support low-resource and endangered languages.

Contribution

Introduces DiPMT++, a framework that adapts LLMs to unseen languages through in-context learning with minimal data, demonstrated on Zhuang and Kalamang.

Findings

01 With DiPMT++, GPT-4 improves from 0 to 16 BLEU on Chinese-to-Zhuang translation

02 DiPMT++ achieves 32 BLEU on Zhuang-to-Chinese translation

03 The framework aids humans in translating unseen languages

Abstract

Existing large language models struggle to support numerous low-resource languages, particularly the extremely low-resource ones, for which there is minimal training data available for effective parameter updating. We thus investigate whether LLMs can learn a new language on the fly solely through prompting. To study this question, we collect a research suite for Zhuang, a language supported by no LLMs currently. We introduce DiPMT++, a framework for adapting LLMs to unseen languages by in-context learning. Using a dictionary and 5K parallel sentences only, DiPMT++ significantly enhances the performance of GPT-4 from 0 to 16 BLEU for Chinese-to-Zhuang translation and achieves 32 BLEU for Zhuang-to-Chinese translation. We also validate the effectiveness of our framework on Kalamang, another unseen language. Furthermore, we demonstrate the practical utility of DiPMT++ in aiding humans in…
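The abstract describes adapting an LLM through in-context learning from a dictionary and a set of parallel sentences. A minimal sketch of how such a prompt might be assembled is shown below. It is not the authors' implementation: the function names, the word-overlap retrieval heuristic, the prompt layout, and the placeholder tokens (`w1`, `t1`, and so on) are all assumptions made for illustration, and no real Zhuang or Kalamang data is used.

```python
from typing import Dict, List, Tuple

# Toy placeholder data (hypothetical tokens, not real language data).
DICTIONARY: Dict[str, List[str]] = {"w1": ["t1"], "w2": ["t2", "t2b"]}
PARALLEL: List[Tuple[str, str]] = [
    ("w1 w3", "t1 t3"),
    ("w4 w5", "t4 t5"),
    ("w1 w2 w6", "t1 t2 t6"),
]


def overlap_score(a: List[str], b: List[str]) -> int:
    """Count distinct tokens shared by two tokenized sentences."""
    return len(set(a) & set(b))


def retrieve_exemplars(
    source: str, parallel: List[Tuple[str, str]], k: int = 2
) -> List[Tuple[str, str]]:
    """Pick the k parallel pairs whose source side shares the most tokens
    with the input. Word overlap is an assumed stand-in for whatever
    retrieval method a real system would use."""
    src_tokens = source.split()
    ranked = sorted(
        parallel,
        key=lambda pair: overlap_score(src_tokens, pair[0].split()),
        reverse=True,
    )
    return ranked[:k]


def dictionary_hints(source: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Format one hint line per distinct source token found in the dictionary."""
    hints: List[str] = []
    seen = set()
    for token in source.split():
        if token in dictionary and token not in seen:
            hints.append(f"{token}: {', '.join(dictionary[token])}")
            seen.add(token)
    return hints


def build_prompt(
    source: str,
    dictionary: Dict[str, List[str]],
    parallel: List[Tuple[str, str]],
    k: int = 2,
) -> str:
    """Assemble exemplars, dictionary hints, and the query into one prompt."""
    lines = ["Translate the source sentence into the target language.", ""]
    for src, tgt in retrieve_exemplars(source, parallel, k):
        lines += [f"Source: {src}", f"Target: {tgt}", ""]
    hints = dictionary_hints(source, dictionary)
    if hints:
        lines.append("Dictionary entries:")
        lines += hints
        lines.append("")
    lines += [f"Source: {source}", "Target:"]
    return "\n".join(lines)


prompt = build_prompt("w1 w2", DICTIONARY, PARALLEL)
print(prompt)
```

The resulting string would be sent to the LLM as the translation request. Keeping retrieval, dictionary lookup, and prompt assembly in separate functions makes it easy to swap the overlap heuristic for a stronger retriever without touching the rest.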

Code & Models

Repositories

luciusssss/zhuangbench (official)

Videos

Teaching Large Language Models an Unseen Language on the Fly · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Softmax · Dense Connections · Label Smoothing · Adam