Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms

Avikalp Srivastava; Hsin Wen Liu; Sumio Fujita

arXiv:1808.09648·cs.CL·May 28, 2019

TL;DR

This paper adapts visual question answering (VQA) models to multimodal community Q&A (CQA) platforms, using the images attached to questions to improve question categorization and expert retrieval, and introduces attention augmentations that improve the grounding of visual information.

Contribution

To the authors' knowledge, this is the first work to adapt VQA models for multimodal CQA tasks, addressing the challenge of integrating images into community question answering systems.

Findings

01. The model outperforms text-only baselines in category classification and expert retrieval.

02. Augmented attention methods improve grounding of visual information.

03. First application of VQA models to real-world multimodal CQA data.

Abstract

Question categorization and expert retrieval methods have been crucial for information organization and accessibility in community question & answering (CQA) platforms. Research in this area, however, has dealt with only the text modality. With the increasing multimodal nature of web content, we focus on extending these methods for CQA questions accompanied by images. Specifically, we leverage the success of representation learning for text and images in the visual question answering (VQA) domain, and adapt the underlying concept and architecture for automated category classification and expert retrieval on image-based questions posted on Yahoo! Chiebukuro, the Japanese counterpart of Yahoo! Answers. To the best of our knowledge, this is the first work to tackle the multimodality challenge in CQA, and to adapt VQA models for tasks on a more ecologically valid source of visual…
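As a rough illustration of the general idea the abstract describes (fusing a question representation with image features, VQA-style, to predict a category), here is a minimal sketch. It is not the authors' architecture: it assumes plain dot-product attention over image regions and fusion by concatenation, and random vectors stand in for learned text and image encoders. All names (`attend`, `classify`, `class_w`, the toy sizes) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(question_vec, region_vecs):
    """Question-guided attention (assumed form): weight each image region
    by its similarity to the question, then average the regions."""
    weights = softmax([dot(question_vec, r) for r in region_vecs])
    dim = len(region_vecs[0])
    attended = [sum(w * r[i] for w, r in zip(weights, region_vecs))
                for i in range(dim)]
    return attended, weights


def classify(question_vec, region_vecs, class_weights):
    """Concatenate text and attended image features, then score each
    category with a linear layer followed by softmax."""
    attended, _ = attend(question_vec, region_vecs)
    fused = question_vec + attended  # list concatenation, length 2 * DIM
    return softmax([dot(w, fused) for w in class_weights])


# Toy stand-ins for encoder outputs and classifier weights (hypothetical).
random.seed(0)
DIM, REGIONS, CATEGORIES = 4, 3, 5
question = [random.gauss(0, 1) for _ in range(DIM)]
regions = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(REGIONS)]
class_w = [[random.gauss(0, 1) for _ in range(2 * DIM)]
           for _ in range(CATEGORIES)]

probs = classify(question, regions, class_w)
print("category probabilities:", [round(p, 3) for p in probs])
```

In a trained system the random vectors would be replaced by encoder outputs and the weights learned from labeled questions. An expert-retrieval variant could plausibly rank candidate answerers against the same fused vector, though that is an assumption rather than something stated in the text above.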

Code & Models

Repositories

avikalp7/VQAtoCQA · tf · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Expert finding and Q&A systems · Domain Adaptation and Few-Shot Learning