Dynamic Context-guided Capsule Network for Multimodal Machine Translation

Huan Lin; Fandong Meng; Jinsong Su; Yongjing Yin; Zhengyuan Yang; Yubin Ge; Jie Zhou; Jiebo Luo

arXiv:2009.02016·cs.CL·September 7, 2020

TL;DR

This paper introduces a novel Dynamic Context-guided Capsule Network that adaptively models visual features at different granularities to improve multimodal machine translation, outperforming existing methods.

Contribution

The paper proposes a dynamic routing mechanism within capsule networks to better utilize visual context in multimodal translation, addressing limitations of fixed context models.

Findings

01 · DCCN outperforms baseline models on the Multi30K dataset.

02 · The model effectively integrates global and regional visual features.

03 · Experimental results show improved translation quality.

Abstract

Multimodal machine translation (MMT), which mainly focuses on enhancing text-only translation with visual features, has attracted considerable attention from both the computer vision and natural language processing communities. Most current MMT models resort to the attention mechanism, global context modeling, or multimodal joint representation learning to utilize visual features. However, the attention mechanism lacks sufficient semantic interactions between modalities, while the other two provide a fixed visual context, which is unsuitable for modeling the observed variability when generating translation. To address these issues, we propose a novel Dynamic Context-guided Capsule Network (DCCN) for MMT. Specifically, at each timestep of decoding, we first employ the conventional source-target attention to produce a timestep-specific source-side context vector. Next, DCCN takes…
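
The first decoding step described in the abstract, source-target attention that yields a timestep-specific source-side context vector, can be sketched as follows. This is a minimal illustration assuming scaled dot-product scoring; the function names (`softmax`, `dot`, `source_context`) and the toy vectors are hypothetical and are not taken from the paper or the MM-DCCN repository. It covers only the attention step, not the capsule routing that follows it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def source_context(decoder_state, encoder_states):
    """Return (attention weights, context vector) for one decoding timestep.

    Assumption: scaled dot-product scoring; the paper may score differently.
    """
    scale = math.sqrt(len(decoder_state))
    scores = [dot(decoder_state, h) / scale for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: two source positions, hidden size 2 (values are made up).
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
decoder_state = [1.0, 0.0]  # decoder state at the current timestep
weights, context = source_context(decoder_state, encoder_states)
print(weights, context)
```

Because the decoder state changes at every timestep, the weights and the resulting context vector change with it, which is what makes the context timestep-specific.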

Code & Models

Repositories

DeepLearnXMU/MM-DCCN
PyTorch · Official

Taxonomy

Methods: Capsule Network