Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation
Wenyu Huang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan

TL;DR
This paper analyzes how different language models perform on multi-hop question answering when the order of context documents is permuted, revealing model differences, the impact of document order, and improvements via attention modifications.
Contribution
It provides a comparative analysis of encoder-decoder and decoder-only models in multi-hop QA, explores the effects of document permutation, and proposes attention-based enhancements.
Findings
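Encoder-decoder models, such as the Flan-T5 family, generally outperform causal decoder-only LMs on MHQA despite being significantly smaller.
Ordering the gold documents to match the reasoning order improves performance.
Adding bi-directional attention by modifying the causal mask improves causal decoder-only models.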
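To make the third finding concrete, the sketch below contrasts a standard causal mask with one that allows bi-directional attention over a context prefix. This summary does not describe the paper's exact mask modification, so the prefix-style design, the function names, and the `context_len` parameter are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def causal_mask(seq_len):
    """Boolean mask where position i may attend only to positions j <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))


def prefix_bidirectional_mask(seq_len, context_len):
    """Causal mask with full attention inside the first `context_len` positions.

    Assumption for illustration: the retrieved documents occupy the first
    `context_len` tokens, and only that block is made bi-directional.
    """
    mask = causal_mask(seq_len)
    # Context tokens can now see later context tokens as well as earlier ones.
    mask[:context_len, :context_len] = True
    return mask
```

Under a plain causal mask, a token in the first document can never attend to a later document, which is one way document order can matter for a decoder-only model. With the prefix variant, tokens after the context remain strictly causal.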
Abstract
Multi-hop Question Answering (MHQA) adds layers of complexity to question answering, making it more challenging. When Language Models (LMs) are prompted with multiple search results, they are tasked not only with retrieving relevant information but also with employing multi-hop reasoning across the information sources. Although LMs perform well on traditional question-answering tasks, the causal mask can hinder their capacity to reason across complex contexts. In this paper, we explore how LMs respond to multi-hop questions by permuting search results (retrieved documents) under various configurations. Our study reveals interesting findings as follows: 1) Encoder-decoder models, such as the ones in the Flan-T5 family, generally outperform causal decoder-only LMs in MHQA tasks, despite being significantly smaller in size; 2) altering the order of gold documents reveals distinct trends in both…
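The permutation setup described in the abstract can be sketched as follows: the same question is paired with every ordering of the retrieved documents, and each resulting prompt would then be sent to a model. The prompt template, the function name `build_prompts`, and the document labels are hypothetical choices made for illustration; the paper's actual configurations are not given in this summary.

```python
from itertools import permutations


def build_prompts(question, documents):
    """Yield (order, prompt) for every ordering of `documents`.

    `order` is a tuple of indices into `documents`, so results can later be
    grouped by where each document appeared in the context.
    """
    for order in permutations(range(len(documents))):
        context = "\n\n".join(
            f"Document {rank + 1}: {documents[i]}"
            for rank, i in enumerate(order)
        )
        yield order, f"{context}\n\nQuestion: {question}\nAnswer:"
```

Since the number of orderings grows factorially with the number of documents, a sketch like this is only practical for a handful of documents; larger sets would need sampled permutations.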
Taxonomy
Topics
Topic Modeling · Multimodal Machine Learning Applications · Information Retrieval and Search Behavior
Methods
Attention Is All You Need · Byte Pair Encoding · Softmax · Attention Dropout · SentencePiece · Residual Connection · Linear Layer · Dropout · Inverse Square Root Schedule
