Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies
Marion Weller-Di Marco, Matthias Huck, Alexander Fraser

TL;DR
This paper compares strategies for modeling target-side morphology in neural machine translation, showing that linguistic approaches improve out-of-domain translation quality and extend to other language pairs.
Contribution
The paper evaluates lemma-tag and segmentation strategies for modeling target-side morphology and shows that they are effective across different neural architectures, especially on out-of-domain data.
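To make the lemma-tag idea concrete, here is a minimal sketch of how inflected target words could be rewritten as a lemma followed by a morphological tag, and how that can shrink the target vocabulary. Everything in it is an assumption for illustration: the toy German lexicon, the tag names such as `<NN.Pl>`, the lemma-then-tag ordering, and the helper names `to_lemma_tag`, `from_lemma_tag`, and `vocabulary` are hypothetical and are not taken from the paper, which would rely on a real morphological analyzer and generator.

```python
# Toy analyses: surface form -> (lemma, morphological tag).
# Hand-written for illustration only; a real system would use a
# morphological analysis tool instead of a lookup table.
TOY_ANALYSES = {
    "Haus": ("Haus", "<NN.Sg>"),
    "Häuser": ("Haus", "<NN.Pl>"),
    "Kind": ("Kind", "<NN.Sg>"),
    "Kinder": ("Kind", "<NN.Pl>"),
    "Buch": ("Buch", "<NN.Sg>"),
    "Bücher": ("Buch", "<NN.Pl>"),
}

# Inverse table used to turn (lemma, tag) pairs back into surface forms.
TOY_GENERATION = {analysis: form for form, analysis in TOY_ANALYSES.items()}


def to_lemma_tag(tokens):
    """Rewrite each surface token as two symbols: its lemma and its tag."""
    output = []
    for token in tokens:
        lemma, tag = TOY_ANALYSES[token]
        output.extend([lemma, tag])
    return output


def from_lemma_tag(symbols):
    """Re-inflect a flat lemma/tag sequence back into surface forms."""
    pairs = zip(symbols[0::2], symbols[1::2])
    return [TOY_GENERATION[(lemma, tag)] for lemma, tag in pairs]


def vocabulary(sequences):
    """Collect the set of distinct symbols used in a list of sequences."""
    return {symbol for sequence in sequences for symbol in sequence}


surface_corpus = [["Haus", "Kinder"], ["Häuser", "Kind"], ["Buch", "Bücher"]]
lemma_tag_corpus = [to_lemma_tag(sentence) for sentence in surface_corpus]

surface_vocab = vocabulary(surface_corpus)
lemma_tag_vocab = vocabulary(lemma_tag_corpus)

print(len(surface_vocab), len(lemma_tag_vocab))
```

In this toy setting the lemma-tag vocabulary is smaller than the surface vocabulary because the two tags are shared across all nouns, at the cost of longer target sequences and a separate re-inflection step after decoding.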
Findings
Linguistic modeling benefits out-of-domain translation.
With stronger Transformer models, linguistic modeling yields smaller in-domain improvements.
The approach was also applied successfully to English-Czech translation.
Abstract
Morphologically rich languages pose difficulties for machine translation. Machine translation engines that rely on statistical learning from parallel training data, such as state-of-the-art neural systems, face challenges especially with rich morphology on the output language side. Key challenges of rich target-side morphology in data-driven machine translation include: (1) A large number of differently inflected word surface forms entails a larger vocabulary and thus data sparsity. (2) Some inflected forms of infrequent terms typically do not appear in the training corpus, which makes closed-vocabulary systems unable to generate these unobserved variants. (3) Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence, both in terms of target-side morpho-syntactic wellformedness and semantic adequacy with…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Dense Connections · Byte Pair Encoding · Label Smoothing · Absolute Position Encodings · Layer Normalization · Residual Connection
