TL;DR
This paper frames bilingual synchronization as editing an initial target sequence into a valid translation of the source. It reports that one generic edit-based system, once fine-tuned, is competitive with specialized systems on simulated interactive MT, translation with Translation Memory (TM), and TM cleaning.
Contribution
It presents a novel, generic edit-based system for bilingual synchronization that can match or outperform dedicated translation systems after fine-tuning.
Findings
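A single generic edit-based system handles simulated interactive MT, translation with TM, and TM cleaning
Fine-tuning is what makes the generic system competitive on each task
The fine-tuned system compares with, and in some cases outperforms, dedicated task-specific systems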
Abstract
Machine Translation (MT) is usually viewed as a one-shot process that generates the target language equivalent of some source text from scratch. We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM) and TM cleaning. Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
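To make the edit-based view concrete, here is a minimal sketch of what "transforming an initial target sequence" can look like at the token level. It is not the paper's model: the paper uses learned neural editors, while this sketch uses Python's standard `difflib` to derive edit operations between two given sentences. The function names `edit_script` and `apply_edits` and the example sentences are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def edit_script(initial, revised):
    """Return token-level edits that turn `initial` into `revised`.

    Each edit is (tag, start, end, new_tokens), where tag is one of
    'replace', 'delete', or 'insert', and [start, end) indexes the
    tokens of `initial`. Illustrative only; not the paper's method.
    """
    a, b = initial.split(), revised.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans need no edit
        ops.append((tag, i1, i2, b[j1:j2]))
    return ops


def apply_edits(initial, ops):
    """Apply an edit script produced by `edit_script` to `initial`."""
    tokens = initial.split()
    # Apply from right to left so earlier indices stay valid.
    for _tag, start, end, new_tokens in reversed(ops):
        tokens[start:end] = new_tokens
    return " ".join(tokens)


if __name__ == "__main__":
    # Hypothetical example: a fuzzy match from a translation memory
    # that must be brought back in sync with a revised source.
    initial_target = "the cat sat on mat"
    synced_target = "the black cat sat on the mat"
    script = edit_script(initial_target, synced_target)
    for op in script:
        print(op)
    print(apply_edits(initial_target, script))
```

In the synchronization setting the revised target is not known in advance. A trained model has to predict edits like these from the source sentence and the initial target, which is the part this sketch leaves out.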
