Word-to-Word Models of Translational Equivalence
I. Dan Melamed (University of Pennsylvania)

TL;DR
This paper introduces methods for biasing statistical translation models toward two properties of parallel texts (bitexts): most words translate to only one other word, and bitext correspondence is noisy. Models biased this way are significantly more accurate than a knowledge-poor baseline.
Contribution
It presents methods for biasing translation models to reflect properties of bitexts, and it shows how a statistical translation model can take advantage of pre-existing knowledge about particular language pairs.
Findings
Biased models are significantly more accurate than a knowledge-poor baseline model when evaluated against a gold standard.
Incorporating language-specific knowledge improves translation performance.
Analysis of how the biases behave under sparse data predicts more accurate models, and the evaluation confirms that prediction.
Abstract
Parallel texts (bitexts) have properties that distinguish them from other kinds of parallel data. First, most words translate to only one other word. Second, bitext correspondence is noisy. This article presents methods for biasing statistical translation models to reflect these properties. Analysis of the expected behavior of these biases in the presence of sparse data predicts that they will result in more accurate models. The prediction is confirmed by evaluation with respect to a gold standard -- translation models that are biased in this fashion are significantly more accurate than a baseline knowledge-poor model. This article also shows how a statistical translation model can take advantage of various kinds of pre-existing knowledge that might be available about particular language pairs. Even the simplest kinds of language-specific knowledge, such as the distinction between…
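To make the first property concrete, the following is a minimal sketch of one way a one-to-one bias could be imposed on word links within a sentence pair. The abstract does not say how the bias is implemented, so everything here is an assumption for illustration: the greedy highest-score-first strategy, the use of raw co-occurrence counts as scores, the function names `cooccurrence_counts` and `one_to_one_links`, and the toy bitext.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(bitext):
    """Count how often each (source, target) word pair shares a sentence pair."""
    counts = Counter()
    for src, tgt in bitext:
        for pair in product(set(src), set(tgt)):
            counts[pair] += 1
    return counts


def one_to_one_links(src, tgt, scores):
    """Greedily link words, highest score first, using each word at most once.

    This is an illustrative heuristic, not necessarily the paper's method.
    Ties are broken alphabetically so the result is deterministic.
    """
    candidates = sorted(
        ((scores.get((s, t), 0), s, t) for s in set(src) for t in set(tgt)),
        key=lambda c: (-c[0], c[1], c[2]),
    )
    used_src, used_tgt, links = set(), set(), []
    for score, s, t in candidates:
        # Skip pairs never seen together and words that are already linked.
        if score > 0 and s not in used_src and t not in used_tgt:
            links.append((s, t))
            used_src.add(s)
            used_tgt.add(t)
    return links


# Toy bitext (hypothetical data, not taken from the paper).
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["a", "house"], ["une", "maison"]),
]
scores = cooccurrence_counts(bitext)
links_house = one_to_one_links(["the", "house"], ["la", "maison"], scores)
links_car = one_to_one_links(["the", "car"], ["la", "voiture"], scores)
```

In the second sentence pair, "car" co-occurs equally often with "la" and "voiture". Once "the" has claimed "la", the one-to-one constraint leaves "voiture" as the only candidate for "car", which is the kind of disambiguation such a bias is meant to provide when data are sparse.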
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
