Scrambled Translation Problem: A Problem of Denoising UNMT
Tamali Banerjee, Rudra Murthy V, Pushpak Bhattacharyya

TL;DR
This paper identifies a scrambled translation error in UNMT systems caused by shuffling noise and proposes a retraining strategy that improves translation quality across multiple language pairs.
Contribution
It introduces the scrambled translation problem in UNMT and proposes a simple retraining method to mitigate it, leading to significant performance improvements.
Findings
Retraining after initial training improves BLEU scores.
The method enhances phrase coherence and attention alignment.
Performance gains are consistent across four language pairs.
Abstract
In this paper, we identify an interesting kind of error in the output of Unsupervised Neural Machine Translation (UNMT) systems like Undreamt. We refer to this error type as the Scrambled Translation problem. We observe that UNMT models which use word shuffle noise (as in the case of Undreamt) can generate correct words, but fail to stitch them together to form phrases. As a result, words of the translated sentence look scrambled, resulting in decreased BLEU. We hypothesise that the reason behind the scrambled translation problem is the 'shuffling noise' which is introduced in every input sentence as a denoising strategy. To test our hypothesis, we experiment by retraining UNMT models with a simple retraining strategy. We stop the training of the Denoising UNMT model after a pre-decided number of iterations and resume the training for the…
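To make the 'shuffling noise' in the abstract concrete, the following is a minimal sketch of one common form of word-shuffle noise for denoising training. It is an illustration under stated assumptions, not the exact procedure used by Undreamt or by the authors: the function name `shuffle_words`, the window parameter `k`, and the noisy-index sorting scheme are all hypothetical choices made for this example.

```python
import random

def shuffle_words(tokens, k=3, rng=None):
    """Return a locally shuffled copy of `tokens`.

    Assumption (not from the paper): each position i gets a noisy key
    i + U(0, k + 1), and tokens are re-ordered by that key. This keeps
    every token within k positions of where it started.
    """
    rng = rng or random.Random()
    # Noisy sort key for every original position.
    keys = [i + rng.uniform(0, k + 1) for i in range(len(tokens))]
    # Stable sort of the positions by noisy key gives the new word order.
    order = sorted(range(len(tokens)), key=keys.__getitem__)
    return [tokens[i] for i in order]

if __name__ == "__main__":
    sentence = "we identify an interesting kind of error".split()
    print(shuffle_words(sentence, k=3, rng=random.Random(0)))
```

A denoising model trained on such inputs is asked to recover the original order from the shuffled one. In this sketch, `k=0` leaves the sentence unchanged, so the same function can represent both noisy and noise-free inputs.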
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Test
