Improving Rare Word Translation With Dictionaries and Attention Masking
Kenneth J. Sible, David Chiang

TL;DR
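This paper improves neural machine translation of rare words by appending bilingual dictionary definitions to source sentences and using attention masking to link each rare word with its definition, yielding gains of up to 1.0 BLEU and 1.6 MacroF1. It targets low-resource and out-of-domain settings, where rare words are most problematic.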
Contribution
It proposes appending bilingual dictionary definitions to source sentences and applying attention masking that links each rare word to its definition, so that a neural machine translation model can translate rare words more accurately.
Findings
Up to 1.0 BLEU improvement
Up to 1.6 MacroF1 gain
Effective in low-resource and out-of-domain settings
Abstract
In machine translation, rare words continue to be a problem for the dominant encoder-decoder architecture, especially in low-resource and out-of-domain translation settings. Human translators solve this problem with monolingual or bilingual dictionaries. In this paper, we propose appending definitions from a bilingual dictionary to source sentences and using attention masking to link together rare words with their definitions. We find that including definitions for rare words improves performance by up to 1.0 BLEU and 1.6 MacroF1.
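The abstract does not spell out the exact masking scheme, so the following is only a minimal sketch of one plausible reading: definitions are appended after the source tokens, ordinary source tokens attend only to each other, and each definition span attends to itself and to the rare word it defines (and vice versa). The function names, the toy dictionary, and the choice of a boolean "may attend" matrix are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical helper: (rare-word index, definition start, definition end)
Link = Tuple[int, int, int]


def append_definitions(
    tokens: List[str],
    dictionary: Dict[str, List[str]],
    rare_words: Set[str],
) -> Tuple[List[str], List[Link]]:
    """Append the definition of each rare source word after the sentence.

    Returns the augmented token list and, for every rare word that has a
    dictionary entry, the span its definition occupies.
    """
    augmented = list(tokens)
    links: List[Link] = []
    for i, tok in enumerate(tokens):
        if tok in rare_words and tok in dictionary:
            start = len(augmented)
            augmented.extend(dictionary[tok])
            links.append((i, start, len(augmented)))
    return augmented, links


def build_attention_mask(
    n_source: int, n_total: int, links: List[Link]
) -> List[List[bool]]:
    """Build mask[q][k] == True when query position q may attend to key k.

    Assumed scheme (not confirmed by the abstract): source tokens see all
    source tokens; a definition sees itself and its rare word; the rare
    word additionally sees its own definition.
    """
    mask = [[False] * n_total for _ in range(n_total)]
    for q in range(n_source):
        for k in range(n_source):
            mask[q][k] = True
    for word_idx, start, end in links:
        for q in range(start, end):
            for k in range(start, end):
                mask[q][k] = True
            mask[q][word_idx] = True   # definition token -> rare word
            mask[word_idx][q] = True   # rare word -> definition token
    return mask


# Toy usage with an invented one-entry dictionary.
source = ["the", "quokka", "slept"]
toy_dictionary = {"quokka": ["small", "wallaby"]}
augmented, links = append_definitions(source, toy_dictionary, {"quokka"})
mask = build_attention_mask(len(source), len(augmented), links)
```

In a real encoder, a matrix like this would typically be converted to additive scores (0 where attention is allowed, a large negative value elsewhere) before the softmax; that step is omitted here.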
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Second Language Acquisition and Learning
Methods: Softmax · Attention Is All You Need
