CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation

Deepon Halder; Thanmay Jayakumar; Raj Dabre

arXiv:2506.19952·cs.CL·August 12, 2025

TL;DR

CycleDistill is a novel bootstrapping method that uses LLMs and cyclical distillation to improve machine translation for low-resource languages without requiring extensive parallel data.

Contribution

It introduces a cyclical distillation approach that uses LLMs, monolingual corpora, and only 1 to 4 few-shot examples to improve translation quality in low-resource settings.

Findings

01. Improves over the baseline by 20-30 chrF points

02. Effective with only 1-4 few-shot examples

03. Softmax activation during distillation yields mild gains

Abstract

Large language models (LLMs), despite their ability to perform few-shot machine translation (MT), often lag behind dedicated MT systems trained on parallel corpora, which are crucial for high-quality MT. However, parallel corpora are often scarce or non-existent for low-resource languages. In this paper, we propose CycleDistill, a bootstrapping approach leveraging LLMs and few-shot translation to obtain high-quality MT systems. CycleDistill involves iteratively generating synthetic parallel corpora from monolingual corpora via zero- or few-shot MT, which are then used to fine-tune the same model that generated them. CycleDistill does not need parallel corpora beyond 1 to 4 few-shot examples, and in our experiments focusing on three Indian languages, by relying solely on monolingual corpora, it can achieve high-quality machine translation,…
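
The iterative procedure described in the abstract (translate monolingual text with the current model, then fine-tune that same model on the resulting synthetic pairs) can be pictured with a minimal sketch. This is not the authors' implementation: the function names `cycle_distill`, `toy_translate`, and `toy_fine_tune`, the `num_cycles` parameter, and the dictionary standing in for a model are all hypothetical, chosen only to make the control flow runnable.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (source sentence, synthetic translation)


def cycle_distill(
    model,
    monolingual: List[str],
    few_shot: List[Pair],
    translate: Callable,  # hypothetical: (model, sentence, few_shot) -> translation
    fine_tune: Callable,  # hypothetical: (model, pairs) -> updated model
    num_cycles: int = 3,  # assumption: the abstract does not state a cycle count
):
    """Repeat: build a synthetic parallel corpus, then fine-tune on it."""
    history: List[List[Pair]] = []
    for _ in range(num_cycles):
        # Step 1: translate monolingual text with the current model,
        # prompting it with the handful of few-shot examples.
        synthetic = [(src, translate(model, src, few_shot)) for src in monolingual]
        # Step 2: fine-tune the same model on the data it just produced.
        model = fine_tune(model, synthetic)
        history.append(synthetic)
    return model, history


# Toy stand-ins so the loop runs without any ML library.
# The "model" here is just a dictionary from source to target strings.
def toy_translate(model: Dict[str, str], sentence: str, few_shot: List[Pair]) -> str:
    lookup = dict(few_shot)  # few-shot examples act as in-context knowledge
    lookup.update(model)     # whatever the model has learned takes priority
    return lookup.get(sentence, sentence.upper())  # placeholder "translation"


def toy_fine_tune(model: Dict[str, str], pairs: List[Pair]) -> Dict[str, str]:
    updated = dict(model)
    updated.update(pairs)    # "training" = memorising the synthetic pairs
    return updated


final_model, history = cycle_distill(
    model={},
    monolingual=["hello", "world"],
    few_shot=[("hello", "namaste")],
    translate=toy_translate,
    fine_tune=toy_fine_tune,
    num_cycles=2,
)
```

In a real setting, `translate` would wrap few-shot prompting of an LLM and `fine_tune` would run a training job; passing both in as callables keeps the loop itself independent of any particular framework.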

Code & Models

Repositories

deeps73/CycleDistill (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Softmax