Neural Machine Translation for Coptic-French: Strategies for Low-Resource Ancient Languages
Nasma Chaoui, Richard Khoury

TL;DR
This study evaluates strategies for translating Coptic into French in a low-resource setting: pivot versus direct translation, pre-training, multi-version fine-tuning, and robustness to noise.
Contribution
The paper presents the first systematic study of Coptic-to-French translation strategies and shows that fine-tuning on a stylistically varied, noise-aware corpus improves translation of a historical language.
Findings
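Fine-tuning on a stylistically varied, noise-aware corpus drawn from aligned biblical texts significantly improves translation quality.
The evaluation pipeline systematically compares pivot and direct translation, alongside the impact of pre-training.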
The study offers practical insights for developing translation tools for historical languages.
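To make the pivot versus direct distinction concrete, here is a minimal sketch, not the authors' pipeline. A direct system maps source to target in one step, while a pivot system chains two systems through an intermediate language. Every name below (`make_pivot`, `toy_model`, the placeholder strings) is hypothetical, and the lookup tables merely stand in for trained models. The summary does not say which pivot language the paper uses, so none is assumed here.

```python
from typing import Callable, Dict

# A translator is modeled as any function from a string to a string.
Translator = Callable[[str], str]


def make_pivot(src_to_pivot: Translator, pivot_to_tgt: Translator) -> Translator:
    """Chain two translators through an intermediate (pivot) language."""
    def translate(text: str) -> str:
        return pivot_to_tgt(src_to_pivot(text))
    return translate


def toy_model(table: Dict[str, str]) -> Translator:
    """Hypothetical stand-in for a trained model: a plain lookup table.

    Unknown inputs are returned unchanged.
    """
    return lambda text: table.get(text, text)


# Placeholder sentences only; no real Coptic or French data is used.
src_to_pivot = toy_model({"SRC_A": "PIVOT_A"})
pivot_to_tgt = toy_model({"PIVOT_A": "TGT_A"})
direct = toy_model({"SRC_A": "TGT_A"})

pivot = make_pivot(src_to_pivot, pivot_to_tgt)
```

With real models, errors from the first stage propagate into the second, which is one reason the two strategies need to be compared empirically rather than assumed equivalent.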
Abstract
This paper presents the first systematic study of strategies for translating Coptic into French. Our comprehensive pipeline systematically evaluates: pivot versus direct translation, the impact of pre-training, the benefits of multi-version fine-tuning, and model robustness to noise. Utilizing aligned biblical corpora, we demonstrate that fine-tuning with a stylistically-varied and noise-aware training corpus significantly enhances translation quality. Our findings provide crucial practical insights for developing translation tools for historical languages in general.
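One common way to build a noise-aware training corpus is to pair corrupted copies of each source sentence with the original, clean target. The sketch below illustrates that general idea under stated assumptions. The abstract does not describe the paper's actual noise model, so the character-level operations (delete, duplicate, replace), the `rate` parameter, and the function names `add_char_noise` and `augment` are all hypothetical.

```python
import random
from typing import List, Tuple


def add_char_noise(text: str, rate: float, rng: random.Random) -> str:
    """Corrupt each non-space character with probability `rate`.

    Assumed noise model (not taken from the paper): a corrupted character
    is deleted, duplicated, or replaced by another character of the text.
    """
    alphabet = [c for c in text if not c.isspace()]
    out: List[str] = []
    for ch in text:
        # Whitespace is never touched, so word boundaries are preserved.
        if ch.isspace() or rng.random() >= rate:
            out.append(ch)
            continue
        op = rng.choice(("delete", "duplicate", "replace"))
        if op == "delete":
            continue
        if op == "duplicate":
            out.extend((ch, ch))
        else:
            out.append(rng.choice(alphabet))
    return "".join(out)


def augment(pairs: List[Tuple[str, str]], rate: float, seed: int = 0) -> List[Tuple[str, str]]:
    """Return the clean pairs plus one noisy-source copy of each pair."""
    rng = random.Random(seed)  # seeded so the corpus is reproducible
    noisy = [(add_char_noise(src, rate, rng), tgt) for src, tgt in pairs]
    return list(pairs) + noisy


# Placeholder sentence pairs; no real corpus data is used.
corpus = [("source sentence one", "target sentence one"),
          ("source sentence two", "target sentence two")]
augmented = augment(corpus, rate=0.1, seed=42)
```

Keeping the target side clean means the model is trained to map both clean and corrupted inputs to the same output, which is the intuition behind robustness to noise.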
