Learning Multimodal Word Representation via Dynamic Fusion Methods

Shaonan Wang; Jiajun Zhang; Chengqing Zong

arXiv:1801.00532·cs.CL·January 3, 2018·6 cites

Open Access

TL;DR

This paper introduces three dynamic fusion methods for multimodal word representations. The methods let the model weight each modality adaptively according to the type of word, which improves the quality of the learned semantic representations.

Contribution

The paper proposes dynamic fusion techniques that assign an importance weight to each modality, yielding multimodal word representations that outperform those of existing models.

Findings

01 Proposed methods outperform unimodal baselines.

02 Proposed methods outperform state-of-the-art multimodal models.

03 Dynamic weighting improves semantic representation quality.

Abstract

Multimodal models have been shown to outperform text-based models at learning semantic word representations. Almost all previous multimodal models treat the representations from different modalities equally. However, information from different modalities contributes differently to the meaning of words. This motivates us to build a multimodal model that dynamically fuses the semantic representations from different modalities according to the type of word. To that end, we propose three novel dynamic fusion methods that assign an importance weight to each modality, with the weights learned under the weak supervision of word association pairs. Extensive experiments demonstrate that the proposed methods outperform strong unimodal baselines and state-of-the-art multimodal models.
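
The idea of assigning an importance weight to each modality before fusion can be illustrated with a small gating sketch. This is not a reproduction of any of the paper's three methods: the function name `gated_fusion`, the sigmoid gate, the vector sizes (300 and 128), and the random gate parameters are all assumptions made for illustration, and training the gates on word association pairs is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_vec, image_vec, w_text, w_image):
    """Scale each modality by a scalar gate in (0, 1), then concatenate.

    Hypothetical sketch: each gate is computed from the word's own vector
    in that modality, so the weights can differ from word to word.
    """
    g_text = sigmoid(text_vec @ w_text)     # importance of the text modality
    g_image = sigmoid(image_vec @ w_image)  # importance of the visual modality
    fused = np.concatenate([g_text * text_vec, g_image * image_vec])
    return fused, (g_text, g_image)

# Toy inputs; dimensions and values are assumptions, not taken from the paper.
rng = np.random.default_rng(0)
text_vec = rng.normal(size=300)
image_vec = rng.normal(size=128)
w_text = rng.normal(scale=0.1, size=300)    # untrained gate parameters
w_image = rng.normal(scale=0.1, size=128)

fused, gates = gated_fusion(text_vec, image_vec, w_text, w_image)
```

With all gate parameters set to zero, both gates equal 0.5 and the result reduces to a uniformly scaled concatenation, which corresponds to the equal-treatment baseline the abstract argues against. In a trained model the gate parameters would instead be fit so that the fused vectors agree with the supervision signal.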

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining