Multimodal Multihop Source Retrieval for Web Question Answering

Navya Yarrabelly; Saloni Mittal

arXiv:2501.04173·cs.CL·January 9, 2025

TL;DR

This paper introduces a graph reasoning network that leverages semantic sentence structure to improve multi-modal multi-hop question answering by efficiently retrieving supporting facts across images and text, outperforming transformer baselines.

Contribution

The paper presents a novel graph-based approach that enhances multi-modal multi-hop QA by utilizing semantic graph structures and adjacency matrices, reducing reliance on large transformers.

Findings

01 Graph structure improves retrieval performance.
02 Message propagation can replace large transformers.
03 The model achieves a 4.6% higher retrieval F1 score than the transformer baselines.
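As a rough illustration of the retrieval F1 metric referred to in the findings, the sketch below computes the standard set-based F1 between retrieved sources and gold supporting sources. It is an assumption that this matches the exact scoring used in the paper's evaluation; the function name `retrieval_f1` and the source ids are hypothetical.

```python
def retrieval_f1(predicted, gold):
    """Set-based F1 between retrieved source ids and gold supporting-source ids.

    Assumption: this is the textbook F1 definition, not necessarily the
    exact scoring script used in the paper's evaluation.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Convention chosen here: no overlap (including empty inputs) scores 0.
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three sources retrieved, two of them are gold.
score = retrieval_f1(["img_3", "txt_7", "txt_9"], ["img_3", "txt_7"])
```

Here precision is 2/3 and recall is 1, so the harmonic mean comes out at 0.8. Retrieving an extra distractor lowers precision without affecting recall.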

Abstract

This work addresses the challenge of learning and reasoning over multi-modal multi-hop question answering (QA). We propose a graph reasoning network based on the semantic structure of the sentences to learn multi-source reasoning paths and find the supporting facts across both image and text modalities for answering the question. We investigate the importance of graph structure for multi-modal multi-hop question answering, centering our analysis on WebQA. We construct a strong baseline model that finds relevant sources using a pairwise classification task. We establish that, with the proper use of feature representations from pre-trained models, graph structure helps in improving multi-modal multi-hop question answering. We point out that both graph structure and adjacency matrix are task-related prior knowledge, and graph structure can be leveraged to improve the…
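To make the idea of message propagation over an adjacency matrix concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the mean-aggregation rule, the dot-product scoring, the toy graph, and all function names are assumptions chosen for illustration. Node 0 plays the question, nodes 1 and 2 are a first-hop and second-hop source, and node 3 is an unconnected distractor.

```python
def normalize_adjacency(adj):
    """Add self-loops and row-normalize, so each node averages over itself and its neighbors."""
    n = len(adj)
    with_loops = [[1.0 if i == j else float(adj[i][j]) for j in range(n)] for i in range(n)]
    return [[value / sum(row) for value in row] for row in with_loops]


def propagate(adj_norm, features):
    """One round of message passing: new feature = weighted mean of neighbor features."""
    dim = len(features[0])
    return [
        [sum(w * features[j][d] for j, w in enumerate(row)) for d in range(dim)]
        for row in adj_norm
    ]


def score_sources(question_idx, features):
    """Score every non-question node by its dot product with the question node."""
    q = features[question_idx]
    return {
        i: sum(a * b for a, b in zip(q, f))
        for i, f in enumerate(features)
        if i != question_idx
    }


# Toy graph (hypothetical): question 0 -- source 1 -- source 2, distractor 3 isolated.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
# Toy 2-d features standing in for pre-trained embeddings (assumed values).
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

scores_before = score_sources(0, features)
scores_after = score_sources(0, propagate(normalize_adjacency(adjacency), features))
```

Before propagation, the second-hop source (node 2) and the distractor (node 3) have identical features and both score 0 against the question. After one round, node 2 has absorbed information from node 1 and scores 0.5, while the isolated distractor stays at 0. This is the sense in which the graph structure acts as prior knowledge for multi-hop retrieval in this toy setting.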

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Web Data Mining and Analysis