A Word-to-Word Model of Translational Equivalence
I. Dan Melamed (University of Pennsylvania)

TL;DR
This paper introduces a fast word-level translation model whose precision/recall trade-off is tunable through a single threshold. It produces high-accuracy translation lexicons and suits applications with limited computational resources.
Contribution
The author presents an efficient algorithm for estimating a partial translation model that can produce dictionary-sized lexicons with over 99% accuracy and can easily incorporate external knowledge.
Findings
Achieves over 99% accuracy in translation lexicon induction
Provides a controllable precision/recall trade-off via a single threshold
Can incorporate external linguistic information easily
Abstract
Many multilingual NLP applications need to translate words between different languages, but cannot afford the computational expense of inducing or applying a full translation model. For these applications, we have designed a fast algorithm for estimating a partial translation model, which accounts for translational equivalence only at the word level. The model's precision/recall trade-off can be directly controlled via one threshold parameter. This feature makes the model more suitable for applications that are not fully statistical. The model's hidden parameters can be easily conditioned on information extrinsic to the model, providing an easy way to integrate pre-existing knowledge such as part-of-speech, dictionaries, word order, etc. Our model can link word tokens in parallel texts as well as other translation models in the literature. Unlike other translation models, it can…
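Illustration
The single-threshold trade-off mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's estimation algorithm: the functions `filter_lexicon` and `precision_recall`, the word pairs, their scores, and the gold set are all hypothetical and invented for illustration. The sketch only assumes that each candidate word pair has some confidence score and shows how raising one cutoff trades recall for precision.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (source word, target word)


def filter_lexicon(scores: Dict[Pair, float], threshold: float) -> Set[Pair]:
    """Keep only the word pairs whose score is at least `threshold`."""
    return {pair for pair, score in scores.items() if score >= threshold}


def precision_recall(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float]:
    """Compute precision and recall of `predicted` against a gold lexicon."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 1.0
    recall = correct / len(gold) if gold else 1.0
    return precision, recall


# Invented toy scores (assumption): not taken from the paper.
toy_scores: Dict[Pair, float] = {
    ("house", "maison"): 0.9,
    ("dog", "chien"): 0.8,
    ("cat", "chat"): 0.5,
    ("the", "maison"): 0.3,  # a spurious pair with a low score
}

# Invented toy gold lexicon (assumption).
toy_gold: Set[Pair] = {("house", "maison"), ("dog", "chien"), ("cat", "chat")}

if __name__ == "__main__":
    for threshold in (0.2, 0.6):
        kept = filter_lexicon(toy_scores, threshold)
        p, r = precision_recall(kept, toy_gold)
        print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, the lower cutoff keeps every pair, including the spurious one, while the higher cutoff keeps only the two best-scored pairs and drops one correct entry along with the spurious one.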
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
