Phrase Based Language Model for Statistical Machine Translation: Empirical Study
Geliang Chen

TL;DR
This paper proposes two phrase-based language models for statistical machine translation that treat phrases, rather than words, as the constituent units of a sentence. In experiments, both outperform a word-based language model on perplexity and n-best list re-ranking.
Contribution
The paper presents two phrase-based language models aimed at the reordering problem in machine translation, and shows empirically that they outperform a word-based model.
Findings
Phrase-based LMs outperform word-based LMs in perplexity.
Phrase-based LMs improve n-best list re-ranking.
Empirical results validate the effectiveness of phrase-based models.
Abstract
Reordering is a challenge for machine translation (MT) systems. In MT, the widely used approach is to apply a word-based language model (LM), which considers the constituent units of a sentence to be words. In speech recognition (SR), some phrase-based LMs have been proposed. However, those LMs are not necessarily suitable or optimal for reordering. We propose two phrase-based LMs which consider the constituent units of a sentence to be phrases. Experiments show that our phrase-based LMs outperform the word-based LM with respect to perplexity and n-best list re-ranking.
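To make the idea in the abstract concrete, here is a minimal sketch of an LM whose units are phrases, used to score perplexity and re-rank an n-best list. It is not the paper's method: the abstract does not specify either of the two proposed models. The sketch assumes a phrase bigram model with add-one smoothing, sentences that are already segmented into phrases, and perplexity normalized per phrase. All names (train, log_prob, perplexity, rerank) and the toy data are hypothetical.

```python
import math
from collections import Counter

BOS = ("<s>",)  # sentence-start marker, treated as a pseudo-phrase (assumption)


def train(segmented_corpus):
    """Collect phrase-bigram counts from sentences pre-split into phrases."""
    context, bigram, vocab = Counter(), Counter(), set()
    for phrases in segmented_corpus:
        seq = [BOS] + [tuple(p) for p in phrases]
        for prev, cur in zip(seq, seq[1:]):
            context[prev] += 1          # times `prev` was seen as a history
            bigram[(prev, cur)] += 1    # times `cur` followed `prev`
            vocab.add(cur)
    return context, bigram, len(vocab)


def log_prob(phrases, model):
    """Add-one smoothed log P(p_1..p_K) = sum over k of log P(p_k | p_{k-1})."""
    context, bigram, v = model
    seq = [BOS] + [tuple(p) for p in phrases]
    return sum(
        math.log((bigram[(prev, cur)] + 1) / (context[prev] + v))
        for prev, cur in zip(seq, seq[1:])
    )


def perplexity(phrases, model):
    """Per-phrase perplexity; normalizing by phrase count is an assumption."""
    return math.exp(-log_prob(phrases, model) / len(phrases))


def rerank(nbest, model):
    """Order candidate phrase sequences from most to least probable."""
    return sorted(nbest, key=lambda cand: log_prob(cand, model), reverse=True)


# Toy data (invented for illustration only).
corpus = [
    [("the", "cat"), ("sat",), ("on", "the", "mat")],
    [("the", "cat"), ("ran",)],
]
model = train(corpus)
nbest = [
    [("sat",), ("the", "cat")],   # phrases in an unseen order
    [("the", "cat"), ("sat",)],   # phrases in the order seen in training
]
best = rerank(nbest, model)[0]
```

Because the unit of prediction is a whole phrase, a candidate that places phrases in an order seen during training scores higher than one that permutes them, which is the intuition behind using such a model for reordering. A real system would typically combine this LM score with the decoder's other feature scores rather than re-rank on it alone.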
Taxonomy
Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling
