A Word-to-Word Model of Translational Equivalence

I. Dan Melamed (University of Pennsylvania)

arXiv:cmp-lg/9706026·cmp-lg·February 3, 2008·23 cites

TL;DR

This paper introduces a fast, adjustable, word-level translation model that efficiently produces high-accuracy translation lexicons, suitable for applications with limited computational resources.

Contribution

The author presents an efficient algorithm for estimating a partial, word-level translation model that can produce dictionary-sized lexicons with over 99% accuracy and can easily incorporate external knowledge.

Findings

01. Achieves over 99% accuracy in translation lexicon induction

02. Provides a controllable precision/recall trade-off via a single threshold

03. Can incorporate external linguistic information easily

Abstract

Many multilingual NLP applications need to translate words between different languages, but cannot afford the computational expense of inducing or applying a full translation model. For these applications, we have designed a fast algorithm for estimating a partial translation model, which accounts for translational equivalence only at the word level. The model's precision/recall trade-off can be directly controlled via one threshold parameter. This feature makes the model more suitable for applications that are not fully statistical. The model's hidden parameters can be easily conditioned on information extrinsic to the model, providing an easy way to integrate pre-existing knowledge such as part-of-speech, dictionaries, word order, etc. Our model can link word tokens in parallel texts as well as other translation models in the literature. Unlike other translation models, it can…
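
The single-threshold trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's estimation algorithm: it assumes word pairs have already been assigned some association score, and it only shows how one cutoff moves precision and recall in opposite directions. The function names (`filter_lexicon`, `precision_recall`), the scores, and the gold lexicon are all hypothetical and made up for illustration.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (source word, target word)


def filter_lexicon(scores: Dict[Pair, float], threshold: float) -> Set[Pair]:
    """Keep only the word pairs whose score is at least `threshold`."""
    return {pair for pair, score in scores.items() if score >= threshold}


def precision_recall(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float]:
    """Compute precision and recall of a predicted lexicon against a gold one."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 1.0
    recall = correct / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical association scores (invented values, not from the paper).
scores: Dict[Pair, float] = {
    ("house", "maison"): 9.1,
    ("dog", "chien"): 7.4,
    ("the", "le"): 3.2,
    ("house", "le"): 1.5,
    ("dog", "maison"): 0.8,
}

# Hypothetical gold-standard lexicon used only to score the toy example.
gold: Set[Pair] = {("house", "maison"), ("dog", "chien"), ("the", "le")}

for t in (0.5, 2.0, 5.0):
    p, r = precision_recall(filter_lexicon(scores, t), gold)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

In this toy data, a low threshold admits the spurious pairs and lowers precision, while a high threshold drops a correct low-scoring pair and lowers recall.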

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification