Bilingual is At Least Monolingual (BALM): A Novel Translation Algorithm that Encodes Monolingual Priors
Jeffrey Cheng, Chris Callison-Burch

TL;DR
BALM introduces a novel translation framework that leverages monolingual priors and BERT embeddings, enabling simpler models to achieve near state-of-the-art performance in English-German translation.
Contribution
The paper presents BALM, a new translation algorithm that encodes monolingual priors into an MT pipeline using BERT, simplifying the model architecture while maintaining high translation quality.
Findings
English-German translation with BALM achieves near-SOTA BLEU scores.
A simple feedforward network suffices under the BALM framework.
Monolingual priors significantly improve translation performance.
Abstract
State-of-the-art machine translation (MT) models do not use knowledge of any single language's structure; this is the equivalent of asking someone to translate from English to German while knowing neither language. BALM is a framework incorporates monolingual priors into an MT pipeline; by casting input and output languages into embedded space using BERT, we can solve machine translation with much simpler models. We find that English-to-German translation on the Multi30k dataset can be solved with a simple feedforward network under the BALM framework with near-SOTA BLEU scores.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
MethodsLinear Layer · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Refunds@Expedia|||How do I get a full refund from Expedia? · Multi-Head Attention
