Specializing Multi-domain NMT via Penalizing Low Mutual Information

Jiyoung Lee, Hantae Kim, Hyunchang Cho, Edward Choi, and Cheonbok Park

arXiv:2210.12910 · cs.CL · October 25, 2022 · 1 citation

TL;DR

This paper proposes a new training objective for multi-domain neural machine translation (NMT) that penalizes low mutual information (MI) so the model learns domain-specific characteristics, and it reports state-of-the-art results among competitive multi-domain NMT models.

Contribution

It introduces a new mutual-information-based penalty to improve domain specialization in multi-domain NMT models.

Findings

1. Achieved state-of-the-art performance on multi-domain NMT benchmarks.

2. Empirically demonstrated increased mutual information leads to better domain specialization.

3. The proposed method outperforms existing models in handling multiple domains.

Abstract

Multi-domain Neural Machine Translation (NMT) trains a single model on data from multiple domains. It is appealing because one model can handle multiple domains effectively. An ideal multi-domain NMT model should learn distinctive domain characteristics simultaneously; however, capturing domain peculiarities is a non-trivial task. In this paper, we investigate domain-specific information through the lens of mutual information (MI) and propose a new objective that penalizes low MI, pushing it higher. Our method achieves state-of-the-art performance among current competitive multi-domain NMT models. We also show empirically that our objective raises low MI, resulting in a domain-specialized multi-domain NMT model.
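The abstract does not state the objective itself, so the following is only a rough illustration of what "penalizing low MI" could mean in practice, not the authors' formulation. It assumes MI is estimated per target token as the log-ratio between a domain-aware prediction, log p(y_t | x, d), and a domain-agnostic one, log p(y_t | x), which is the standard pointwise form of mutual information. The function names, the hinge-style penalty, and the `margin` and `weight` parameters are all hypothetical choices made for this sketch.

```python
import math


def pointwise_mi(logp_domain_aware, logp_domain_agnostic):
    """Per-token pointwise MI estimate: log p(y_t | x, d) - log p(y_t | x)."""
    return [a - b for a, b in zip(logp_domain_aware, logp_domain_agnostic)]


def mi_penalized_loss(logp_domain_aware, logp_domain_agnostic,
                      margin=0.0, weight=1.0):
    """Mean negative log-likelihood plus a hinge penalty on low-MI tokens.

    Hypothetical objective for illustration only: tokens whose MI estimate
    falls below `margin` contribute (margin - MI) to the penalty term.
    """
    if len(logp_domain_aware) != len(logp_domain_agnostic):
        raise ValueError("both inputs must cover the same target tokens")
    n = len(logp_domain_aware)
    if n == 0:
        raise ValueError("at least one target token is required")

    mi = pointwise_mi(logp_domain_aware, logp_domain_agnostic)
    nll = -sum(logp_domain_aware) / n
    penalty = sum(max(0.0, margin - m) for m in mi) / n
    return nll + weight * penalty


# Toy example with two target tokens (probabilities are made up).
lp_aware = [math.log(0.5), math.log(0.25)]      # log p(y_t | x, d)
lp_agnostic = [math.log(0.25), math.log(0.5)]   # log p(y_t | x)

print(pointwise_mi(lp_aware, lp_agnostic))      # first token positive, second negative
print(mi_penalized_loss(lp_aware, lp_agnostic)) # only the second token is penalized
```

In this toy case the first token is more likely once the domain is known, so its MI estimate is positive and it incurs no penalty; the second token is less likely under the domain-aware prediction, so it is the one the penalty pushes on.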

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling