Semi-supervised Visual Feature Integration for Pre-trained Language Models

Lisai Zhang; Qingcai Chen; Dongfang Li; Buzhou Tang

arXiv:1912.00336·cs.CL·August 14, 2020

Open Access

TL;DR

This paper introduces a semi-supervised visual feature integration method for pre-trained language models that improves performance on natural language understanding tasks without requiring an aligned image for each sentence.

Contribution

The proposed framework allows visual features to be integrated into language models without requiring aligned image-sentence pairs, using a visualization and fusion mechanism.
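As a rough illustration of what a visualization and fusion step could look like, the minimal sketch below retrieves image features for a sentence from a small image database and then mixes them into a text vector. Everything in it is an assumption made for illustration, not the authors' method: the keyword-indexed `IMAGE_DB`, the averaging in `visualize`, the sigmoid gate in `fuse`, and the three-dimensional toy vectors are all hypothetical, and the page does not describe how the paper actually implements either step.

```python
import math

# Hypothetical image database: keyword -> precomputed image feature vector.
# A real system would hold features extracted by a vision model.
IMAGE_DB = {
    "dog": [0.9, 0.1, 0.0],
    "ball": [0.1, 0.8, 0.1],
}


def visualize(sentence, image_db, dim=3):
    """Average the features of images whose keyword occurs in the sentence.

    Returns a zero vector when nothing matches, so a sentence with no
    related image still gets a neutral visual representation.
    """
    hits = [image_db[w] for w in sentence.lower().split() if w in image_db]
    if not hits:
        return [0.0] * dim
    return [sum(column) / len(hits) for column in zip(*hits)]


def fuse(text_vec, visual_vec, gate_bias=0.0):
    """Gated sum (an assumed fusion rule, chosen for simplicity).

    gate = sigmoid(text . visual + bias); output = text + gate * visual.
    Both vectors are assumed to have the same dimension already.
    """
    score = sum(t * v for t, v in zip(text_vec, visual_vec)) + gate_bias
    gate = 1.0 / (1.0 + math.exp(-score))
    return [t + gate * v for t, v in zip(text_vec, visual_vec)]


if __name__ == "__main__":
    text_vec = [1.0, 2.0, 3.0]  # stand-in for a language model sentence vector
    visual_vec = visualize("The dog chased the ball", IMAGE_DB)
    print(fuse(text_vec, visual_vec))
```

In this sketch a sentence with no matching image yields a zero visual vector, so the fused output equals the original text vector. That mirrors the idea of treating visual features as an external add-on that does not need an aligned image for every sentence.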

Findings

01 Improves performance on natural language inference tasks
02 Enhances reading comprehension accuracy
03 Operates efficiently with only an image database

Abstract

Integrating visual features has proven useful for natural language understanding tasks. Nevertheless, in most existing multimodal language models, the alignment of visual and textual data is expensive. In this paper, we propose a novel semi-supervised visual integration framework for pre-trained language models. In the framework, the visual features are obtained through a visualization and fusion mechanism. Its distinguishing features are: 1) the integration is conducted via a semi-supervised approach, which does not require aligned images for every sentence; 2) the visual features are integrated as an external component and can be directly used by pre-trained language models. To verify the efficacy of the proposed framework, we conduct experiments on both natural language inference and reading comprehension tasks. The results demonstrate that our mechanism brings improvement to two…

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling