Phrase Based Language Model for Statistical Machine Translation: Empirical Study

Geliang Chen

arXiv:1501.05203·cs.CL·February 19, 2015

TL;DR

This paper proposes two phrase-based language models for statistical machine translation, motivated by the reordering problem. In experiments, both outperform a word-based language model on perplexity and n-best list re-ranking.

Contribution

The paper proposes two phrase-based language models that treat phrases, rather than words, as the constituent units of a sentence, and evaluates them against a word-based model with reordering in machine translation as the motivating problem.

Findings

01

Phrase-based LMs outperform word-based LMs in perplexity.

02

Phrase-based LMs improve n-best list re-ranking.

03

Both proposed phrase-based LMs are compared empirically against a word-based LM baseline, and the results favor the phrase-based models.

Abstract

Reordering is a challenge for machine translation (MT) systems. In MT, the most widely used approach is to apply a word-based language model (LM), which treats words as the constituent units of a sentence. In speech recognition (SR), some phrase-based LMs have been proposed. However, those LMs are not necessarily suitable or optimal for reordering. We propose two phrase-based LMs that treat phrases as the constituent units of a sentence. Experiments show that our phrase-based LMs outperform the word-based LM with respect to perplexity and n-best list re-ranking.
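
To make the two evaluation criteria concrete, the sketch below scores sentences with a bigram LM whose units are phrases, then uses it to compute perplexity and to re-rank a small n-best list. It is only an illustration under stated assumptions and is not the paper's method: the abstract does not specify the two proposed models, so the add-one smoothing, the `BigramLM` class, the `rerank` helper, the toy training data, and the pre-segmented phrase tuples are all hypothetical. Passing tuples of single words instead of phrases turns the same class into a word-based baseline.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one smoothed bigram LM over arbitrary units (words or phrases).

    Hypothetical illustration; not the models proposed in the paper.
    """

    def __init__(self, sequences):
        self.history = Counter()  # counts of units seen as a bigram history
        self.bigrams = Counter()  # counts of (previous unit, current unit)
        for seq in sequences:
            units = ["<s>"] + list(seq) + ["</s>"]
            self.history.update(units[:-1])
            self.bigrams.update(zip(units[:-1], units[1:]))
        # Predictable units: everything seen, end-of-sentence, and an unknown slot.
        self.vocab = {u for seq in sequences for u in seq} | {"</s>", "<unk>"}

    def logprob(self, seq):
        """Natural-log probability of a sequence of units."""
        units = ["<s>"] + list(seq) + ["</s>"]
        total = 0.0
        for prev, cur in zip(units[:-1], units[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.history[prev] + len(self.vocab)
            total += math.log(num / den)
        return total

    def perplexity(self, seq):
        """Per-unit perplexity; the end-of-sentence symbol counts as a unit."""
        return math.exp(-self.logprob(seq) / (len(seq) + 1))


def rerank(lm, nbest):
    """Sort hypotheses from most to least probable under the LM."""
    return sorted(nbest, key=lm.logprob, reverse=True)


# Toy training data, already segmented into phrases (assumed given).
train = [
    ("the cat", "sat", "on the mat"),
    ("the dog", "sat", "on the mat"),
]
lm = BigramLM(train)

# Two hypotheses with the same phrases in different orders.
nbest = [
    ("on the mat", "sat", "the cat"),
    ("the cat", "sat", "on the mat"),
]
best = rerank(lm, nbest)[0]
print(best, lm.perplexity(best))
```

In this toy setting the hypothesis whose phrase order matches the training data receives the higher LM score and the lower perplexity, which is the kind of ordering signal n-best re-ranking relies on. A real system would combine the LM score with the other translation model features rather than sort on it alone.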

Taxonomy

Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling