A simple discriminative training method for machine translation with large-scale features

Tian Xia; Shaodan Zhai; Shaojun Wang

arXiv:1909.09491·cs.CL·September 23, 2019

TL;DR

This paper introduces a new discriminative training method for statistical machine translation that simplifies implementation while maintaining robustness and effectiveness with large-scale features.

Contribution

A novel training approach that treats N-best lists as permutations and minimizes Plackett-Luce loss, offering an easier-to-implement alternative to MIRAs.

Findings

01. More robust than MERT in experiments

02. Comparable to MIRAs in performance

03. Simpler to implement than MIRAs

Abstract

Margin infused relaxed algorithms (MIRAs) dominate model tuning in statistical machine translation when large-scale features are used, but they are also notorious for being complex to implement. We introduce a new method that regards an N-best list as a permutation and minimizes the Plackett-Luce loss of ground-truth permutations. Experiments with large-scale features demonstrate that the new method is more robust than MERT; although it only matches MIRAs in performance, it has a comparative advantage: it is easier to implement.
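To make the idea of scoring an N-best list as a permutation concrete, the following is a minimal sketch of the standard Plackett-Luce negative log-likelihood. It is not the authors' implementation, and the paper's exact formulation may differ (for example, by truncating the permutation). The function names, the linear scoring model, and the use of a per-hypothesis quality value (such as sentence-level BLEU) to define the ground-truth order are all assumptions made for illustration.

```python
import math


def plackett_luce_loss(scores_in_gold_order):
    """Negative log-likelihood of the gold permutation under Plackett-Luce.

    scores_in_gold_order[j] is the model score of the hypothesis that the
    ground-truth ordering places at rank j (best first).
    """
    loss = 0.0
    for j in range(len(scores_in_gold_order)):
        tail = scores_in_gold_order[j:]
        # Log-sum-exp over the hypotheses not yet placed, computed stably.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        # Stage j contributes -log( exp(s_j) / sum_{k >= j} exp(s_k) ).
        loss += log_norm - scores_in_gold_order[j]
    return loss


def model_score(weights, features):
    """Linear model score (an assumption): dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def nbest_loss(weights, nbest):
    """Loss for one N-best list.

    nbest is a list of (feature_vector, quality) pairs. The quality value
    (hypothetical here, e.g. sentence-level BLEU) defines the gold order.
    """
    ranked = sorted(nbest, key=lambda h: h[1], reverse=True)
    scores = [model_score(weights, feats) for feats, _ in ranked]
    return plackett_luce_loss(scores)


# Toy N-best list with one feature per hypothesis (illustrative values only).
toy_nbest = [([2.0], 0.9), ([1.0], 0.5), ([0.0], 0.1)]
loss_agreeing = nbest_loss([1.0], toy_nbest)
loss_reversed = nbest_loss([-1.0], toy_nbest)
```

In this sketch, weights that rank the toy hypotheses in the same order as their quality values yield a lower loss than weights that reverse the order, which is the signal a gradient-based tuner would follow.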

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques