Self-Augmented In-Context Learning for Unsupervised Word Translation

Yaoyiran Li, Anna Korhonen, Ivan Vulić

arXiv:2402.10024 · cs.CL · June 6, 2024 · 2 citations

TL;DR

This paper introduces SAIL, a self-augmented in-context learning method that iteratively improves unsupervised word translation in large language models, surpassing previous approaches and achieving state-of-the-art results.

Contribution

The paper proposes SAIL, an iterative method that improves unsupervised bilingual lexicon induction (BLI) with LLMs when no seed translation pairs are available, outperforming both zero-shot prompting and mapping-based baselines.

Findings

01 SAIL outperforms zero-shot prompting on BLI benchmarks.

02 SAIL surpasses traditional mapping-based approaches in unsupervised BLI.

03 The method achieves state-of-the-art performance across multiple language pairs.

Abstract

Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages. To address this challenge with LLMs, we propose self-augmented in-context learning (SAIL) for unsupervised BLI: starting from a zero-shot prompt, SAIL iteratively induces a set of high-confidence word translation pairs for in-context learning (ICL) from an LLM, which it then reapplies to the same LLM in the ICL fashion. Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks spanning a wide range of language pairs, also outperforming mapping-based baselines across the board. In…
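
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Translate` callable stands in for an LLM call, the round-trip (back-translation) check is an assumed way of deciding which pairs count as "high-confidence", and the toy translators and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]
# Hypothetical interface standing in for an LLM call:
# (word, in-context example pairs) -> predicted translation.
Translate = Callable[[str, List[Pair]], str]


def high_confidence_pairs(
    words: List[str], forward: Translate, backward: Translate, examples: List[Pair]
) -> List[Pair]:
    """Keep (src, tgt) pairs whose round trip src -> tgt -> src returns src.

    The round-trip check is an assumption made for this sketch; the abstract
    only says that high-confidence pairs are induced from the LLM.
    """
    flipped = [(tgt, src) for src, tgt in examples]
    kept = []
    for src in words:
        tgt = forward(src, examples)
        if backward(tgt, flipped) == src:
            kept.append((src, tgt))
    return kept


def sail_sketch(
    words: List[str], forward: Translate, backward: Translate, iterations: int = 2
) -> List[Pair]:
    """Start zero-shot, then feed the induced pairs back as in-context examples."""
    examples: List[Pair] = []  # the first pass uses no examples (zero-shot)
    for _ in range(iterations):
        examples = high_confidence_pairs(words, forward, backward, examples)
    return examples


# Toy stand-ins for the LLM (hypothetical): the forward "model" mistranslates
# "haus" zero-shot but gets it right once it sees any in-context examples.
_DE_EN = {"hund": "dog", "katze": "cat", "haus": "house"}
_EN_DE = {en: de for de, en in _DE_EN.items()}


def toy_forward(word: str, examples: List[Pair]) -> str:
    if word == "haus" and not examples:
        return "home"
    return _DE_EN.get(word, word)


def toy_backward(word: str, examples: List[Pair]) -> str:
    return _EN_DE.get(word, word)


first_pass = sail_sketch(["hund", "katze", "haus"], toy_forward, toy_backward, iterations=1)
second_pass = sail_sketch(["hund", "katze", "haus"], toy_forward, toy_backward, iterations=2)
```

In this toy run, the first pass drops the "haus" pair because its round trip fails, and the second pass recovers it once the two retained pairs are supplied as in-context examples. A real setup would replace the toy translators with prompted LLM calls and would need its own confidence criterion.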

Code & Models

Repositories

cambridgeltl/sail-bli (PyTorch, official)

Videos

Self-Augmented In-Context Learning for Unsupervised Word Translation · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems

Methods: Sparse Evolutionary Training