Learning Semantic Representations for the Phrase Translation Model
Jianfeng Gao, Xiaodong He, Wen-tau Yih, and Li Deng

TL;DR
This paper introduces a semantic-based phrase translation model that uses a multi-layer neural network to project source and target phrases into a low-dimensional latent semantic space, with the network trained to directly optimize end-to-end translation quality.
Contribution
It presents a novel neural network approach that embeds source and target phrases in a latent semantic space and uses the distance between them as a translation score, improving a state-of-the-art phrase-based statistical machine translation system.
Findings
BLEU gains of 0.7-1.0 points on two Europarl translation tasks (English-French and German-English)
Effective semantic embedding of phrase pairs for translation scoring
Improves on a state-of-the-art phrase-based statistical machine translation system
Abstract
This paper presents a novel semantic-based phrase translation model. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent semantic space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a multi-layer neural network whose weights are learned on parallel training data. The learning is aimed to directly optimize the quality of end-to-end machine translation results. Experimental evaluation has been performed on two Europarl translation tasks, English-French and German-English. The results show that the new semantic-based phrase translation model significantly improves the performance of a state-of-the-art phrase-based statistical machine translation system, leading to a gain of 0.7-1.0 BLEU points.
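The scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words input, the tanh activations, the layer sizes, the toy vocabularies, the use of separate networks per language, and the choice of negative Euclidean distance as the score are all assumptions made for illustration. The weights here are random, whereas the paper learns them on parallel data to optimize end-to-end translation quality.

```python
import math
import random

def bag_of_words(phrase, vocab):
    """Turn a phrase into a word-count vector (assumed input representation)."""
    vec = [0.0] * len(vocab)
    for word in phrase.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec

def init_layer(n_in, n_out, rng):
    """Small random weight matrix with n_out rows and n_in columns (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def project(x, layers):
    """Pass a vector through a multi-layer network with tanh activations."""
    h = x
    for weights in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h))) for row in weights]
    return h

def translation_score(src_vec, tgt_vec):
    """Score a phrase pair by negative Euclidean distance (assumed metric):
    closer vectors in the latent space get a higher score."""
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(src_vec, tgt_vec)))

rng = random.Random(0)

# Hypothetical toy vocabularies for the two languages.
src_vocab = {"the": 0, "house": 1, "green": 2}
tgt_vocab = {"la": 0, "maison": 1, "verte": 2}

# One two-layer network per language: 3 inputs, 4 hidden units, 2 latent dimensions.
src_layers = [init_layer(3, 4, rng), init_layer(4, 2, rng)]
tgt_layers = [init_layer(3, 4, rng), init_layer(4, 2, rng)]

y_src = project(bag_of_words("the green house", src_vocab), src_layers)
y_tgt = project(bag_of_words("la maison verte", tgt_vocab), tgt_layers)
score = translation_score(y_src, y_tgt)
print("latent source vector:", y_src)
print("latent target vector:", y_tgt)
print("translation score:", score)
```

In a phrase-based decoder, a score like this would typically be added as one more feature alongside the existing phrase table features, which is consistent with the abstract's description of improving an existing system rather than replacing it.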
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies
