Counter-Interference Adapter for Multilingual Machine Translation

Yaoming Zhu; Jiangtao Feng; Chengqi Zhao; Mingxuan Wang; Lei Li

arXiv:2104.08154·cs.CL·September 14, 2021

TL;DR

This paper introduces CIAT, an adapted Transformer model that reduces cross-language interference in multilingual machine translation and outperforms strong multilingual baselines on 64 of 66 language directions.

Contribution

The paper presents CIAT, a novel adapter for Transformer models that mitigates interference in multilingual translation with minimal additional parameters.

Findings

01 Outperforms strong multilingual baselines on 64 of 66 language directions

02 42 of those directions improve by more than 0.5 BLEU

03 Gains hold across the IWSLT, OPUS-100, and WMT benchmarks

Abstract

Developing a unified multilingual model has long been a pursuit for machine translation. However, existing approaches suffer from performance degradation: a single multilingual model is inferior to separately trained bilingual ones on rich-resource languages. We conjecture that this phenomenon is due to interference caused by joint training with multiple languages. To address the issue, we propose CIAT, an adapted Transformer model with a small parameter overhead for multilingual machine translation. We evaluate CIAT on multiple benchmark datasets, including IWSLT, OPUS-100, and WMT. Experiments show that CIAT consistently outperforms strong multilingual baselines on 64 of the 66 language directions, 42 of which see an improvement of more than 0.5 BLEU. Our code is available at https://github.com/Yaoming95/CIAT.

Code & Models

Repositories

yaoming95/ciat · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Label Smoothing · Layer Normalization · Residual Connection · Multi-Head Attention · Byte Pair Encoding