Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki, Muhammad Abdul-Mageed

TL;DR
This paper introduces an unsupervised method for mining in-context examples for machine translation using large language models, improving translation quality in low-resource and multilingual settings without relying on annotated data.
Contribution
The authors propose a novel unsupervised approach to mine and filter in-context examples for machine translation, enhancing performance across multiple languages without supervised data.
Findings
Achieved translation quality comparable to or better than that of supervised in-context samples.
Outperformed existing state-of-the-art unsupervised MT methods by an average of 7 BLEU points.
Effective in low-resource and multilingual translation scenarios.
Abstract
Large Language Models (LLMs) have demonstrated impressive performance on a wide range of natural language processing (NLP) tasks, primarily through in-context learning (ICL). In ICL, the LLM is provided with examples that represent a given task such that it learns to generate answers for test inputs. However, access to these in-context examples is not guaranteed, especially for low-resource or massively multilingual tasks. In this work, we propose an unsupervised approach to mine in-context examples for machine translation (MT), enabling unsupervised MT (UMT) across different languages. Our approach begins with word-level mining to acquire word translations that are then used to perform sentence-level mining. As the quality of mined parallel pairs may not be optimal due to noise or mistakes, we introduce a filtering criterion to select the optimal in-context examples from a pool of…
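To make the ICL setup in the abstract concrete, here is a minimal sketch of how a handful of (source, target) pairs might be assembled into a few-shot translation prompt. The function name `build_mt_prompt`, the prompt template, and the sample sentences are illustrative assumptions; they are not taken from the paper, and the paper's actual mining and filtering steps are not reproduced here.

```python
from typing import List, Tuple


def build_mt_prompt(
    examples: List[Tuple[str, str]],
    source_sentence: str,
    src_lang: str,
    tgt_lang: str,
) -> str:
    """Assemble a few-shot MT prompt from in-context example pairs.

    The template below is a hypothetical one chosen for illustration,
    not the template used by the authors.
    """
    lines = [f"Translate from {src_lang} to {tgt_lang}."]
    # Each in-context example is shown as a source line followed by its translation.
    for src, tgt in examples:
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    # The test input comes last, with the target side left open for the LLM to complete.
    lines.append(f"{src_lang}: {source_sentence}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


# Made-up example pairs; in the paper's setting these would be mined, not annotated.
prompt = build_mt_prompt(
    [("Bonjour.", "Hello."), ("Merci beaucoup.", "Thank you very much.")],
    "Bonne nuit.",
    "French",
    "English",
)
print(prompt)
```

In a supervised setting the example pairs would come from annotated parallel data; the approach described above instead aims to obtain such pairs without supervision.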
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Speech and dialogue systems
