Machine Translation with Unsupervised Length-Constraints
Jan Niehues

TL;DR
This paper introduces an end-to-end approach to length-constrained machine translation in which text compression is learned without supervision and the length constraint is integrated directly into the model. Combined with zero-shot multilingual translation, the same approach also performs unsupervised monolingual sentence compression.
Contribution
It presents a novel unsupervised method for length-constrained translation and compression, combining zero-shot multilingual translation with constraint integration.
Findings
Significant improvement in translation quality under length constraints
Successful unsupervised monolingual sentence compression
Effective integration of length constraints into the translation model
Abstract
We have seen significant improvements in machine translation due to the use of deep learning. While the improvements in translation quality are impressive, the encoder-decoder architecture enables many more possibilities. In this paper, we explore one of these, the generation of constrained translations. We focus on length constraints, which are essential if the translation should be displayed in a given format. In this work, we propose an end-to-end approach for this task. In contrast to a traditional method that first translates and then performs sentence compression, our approach learns the text compression completely unsupervised. By combining the idea with zero-shot multilingual machine translation, we are also able to perform unsupervised monolingual sentence compression. In order to fulfill the length constraints, we investigated several methods to integrate the constraints into the model.…
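To make the notion of a length constraint concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not say which integration methods were investigated, so the pseudo-token format `<len_N>`, the token-based length measure, and the helper names `target_budget`, `add_length_token`, and `satisfies_constraint` are all assumptions made for illustration. The sketch only shows how a budget could be derived from the source length, how it might be exposed to a model as an extra input token, and how an output could be checked against it.

```python
def target_budget(source_tokens, ratio):
    """Maximum number of output tokens allowed for a given length ratio.

    Assumption: length is counted in whitespace tokens; other units
    (characters, subwords) would work the same way.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return max(1, int(len(source_tokens) * ratio))


def add_length_token(source_tokens, budget):
    """Prepend a hypothetical pseudo-token that carries the budget."""
    return [f"<len_{budget}>"] + list(source_tokens)


def satisfies_constraint(output_tokens, budget):
    """True when the output fits within the token budget."""
    return len(output_tokens) <= budget


source = "we explore the generation of length constrained translations".split()
budget = target_budget(source, 0.5)             # half the source length
model_input = add_length_token(source, budget)  # what a model would receive
```

In an end-to-end setup of this kind, the same model input format can be used for any target language, including the source language itself, which is what would make monolingual compression possible without compression-specific training data.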
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
