Predicting Anchored Text from Translation Memories for Machine Translation Using Deep Learning Methods
Richard Yue, John E. Ortega

TL;DR
This paper explores using deep learning models like Word2Vec, BERT, and ChatGPT to predict anchored words in translation memories, aiming to improve machine translation quality for CAT tools.
Contribution
It demonstrates that deep learning techniques can effectively predict anchored words, offering an alternative to neural machine translation in translation memory repair.
Findings
Word2Vec, BERT, and ChatGPT achieve results comparable to or better than those of neural machine translation.
Deep learning models effectively predict anchored words in translation memories.
The approach enhances translation memory repair for French-English translation.
Abstract
Translation memories (TMs) are the backbone of professional translation software known as computer-aided translation (CAT) tools. To translate with a CAT tool, a translator uses the TM to retrieve translations similar to the segment to be translated (s'). Many CAT tools offer a fuzzy-match algorithm to locate segments (s) in the TM that are close in distance to s'. After locating two similar segments, the CAT tool presents parallel segments (s, t), each pairing a segment in the source language with its translation in the target language. CAT tools also include fuzzy-match repair (FMR) techniques that automatically use the parallel segments from the TM to create new TM entries containing a modified version of the original, intended to serve as the translation of s'. Most FMR techniques use machine translation as a way of…
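The fuzzy-match lookup described in the abstract can be illustrated with a minimal sketch. It assumes a word-level edit distance normalized by the longer segment, which is one common way to score fuzzy matches; the abstract does not say which metric the authors or any particular CAT tool use. The function names, the 0.6 threshold, and the toy TM are all hypothetical.

```python
from typing import List, Optional, Tuple


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete a word from a
                            curr[j - 1] + 1,      # insert a word from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_match_score(s_prime: str, s: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical token sequences."""
    a, b = s_prime.lower().split(), s.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / longest


def best_match(
    s_prime: str,
    tm: List[Tuple[str, str]],
    threshold: float = 0.6,  # assumed cutoff, not taken from the paper
) -> Optional[Tuple[float, str, str]]:
    """Return (score, s, t) for the closest TM entry, or None if none qualifies."""
    scored = [(fuzzy_match_score(s_prime, s), s, t) for s, t in tm]
    candidates = [c for c in scored if c[0] >= threshold]
    return max(candidates, key=lambda c: c[0]) if candidates else None


# Toy TM of (source, target) pairs, invented for illustration.
toy_tm = [
    ("the cat sat on the mat", "le chat s'est assis sur le tapis"),
    ("the weather is nice today", "il fait beau aujourd'hui"),
]
match = best_match("the dog sat on the mat", toy_tm)
```

Here `match` holds the first TM entry, whose source differs from s' by a single word. That mismatched span is the part of the target segment t that an FMR technique would then try to repair.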
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Attention Is All You Need · Attention Dropout · Linear Warmup With Linear Decay · Linear Layer · Weight Decay · Position-Wise Feed-Forward Layer · Label Smoothing · WordPiece · Byte Pair Encoding
