Lost in the Middle, and In-Between: Enhancing Language Models' Ability to Reason Over Long Contexts in Multi-Hop QA

George Arthur Baker; Ankush Raut; Sagi Shaier; Lawrence E Hunter; Katharina von der Wense

arXiv:2412.10079·cs.CL·December 16, 2024

TL;DR

This paper investigates how long-context language models fail to use information evenly across their inputs in multi-hop question answering, especially when the relevant information is spread out, and proposes methods to improve reasoning over long contexts.

Contribution

It demonstrates the 'lost in the middle' bias in multi-hop QA and applies techniques such as knowledge graph extraction, summarization, and chain-of-thought prompting to mitigate it.

Findings

01 Performance drops as relevant information moves away from the input edges

02 Reducing extraneous content improves model reasoning

03 Chain-of-thought prompting enhances multi-hop reasoning

Abstract

Previous work finds that recent long-context language models fail to make equal use of information in the middle of their inputs, preferring pieces of information located at the tail ends, which creates an undue bias in situations where we would like models to be equally capable of using different parts of the input. Thus far, the problem has mainly been considered in settings with single pieces of critical information, leading us to question what happens when multiple necessary pieces of information are spread out over the inputs. Here, we demonstrate the effects of the "lost in the middle" problem in the multi-hop question answering setting -- in which multiple reasoning "hops" over disconnected documents are required -- and show that performance degrades not only with respect to the distance of information from the edges of the context, but also between pieces of information.…
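
The setting described in the abstract can be illustrated with a minimal Python sketch that places the documents needed for a multi-hop question at chosen positions among distractor documents, then reports two quantities the abstract refers to: each document's distance from the nearer edge of the context, and the number of distractors separating consecutive necessary documents. This is not the authors' code. The function names (`place_documents`, `edge_distances`, `gaps_between`) and the exact way the distances are measured are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def place_documents(
    gold: Sequence[str],
    distractors: Sequence[str],
    positions: Sequence[int],
) -> List[str]:
    """Build a context with gold[i] at index positions[i].

    Distractors fill the remaining slots in their original order.
    (Hypothetical helper, not taken from the paper's repository.)
    """
    total = len(gold) + len(distractors)
    if len(positions) != len(gold):
        raise ValueError("need exactly one position per gold document")
    if len(set(positions)) != len(positions):
        raise ValueError("positions must be distinct")
    if any(p < 0 or p >= total for p in positions):
        raise ValueError("position out of range")

    context: List[Optional[str]] = [None] * total
    for doc, pos in zip(gold, positions):
        context[pos] = doc

    # Fill every empty slot with the next unused distractor.
    filler = iter(distractors)
    return [doc if doc is not None else next(filler) for doc in context]


def edge_distances(positions: Sequence[int], total: int) -> List[int]:
    """Distance of each gold document from the nearer end of the context."""
    return [min(p, total - 1 - p) for p in positions]


def gaps_between(positions: Sequence[int]) -> List[int]:
    """Number of distractors between consecutive gold documents."""
    ordered = sorted(positions)
    return [b - a - 1 for a, b in zip(ordered, ordered[1:])]


# Two-hop example: two gold documents among four distractors.
context = place_documents(["G1", "G2"], ["D1", "D2", "D3", "D4"], [1, 4])
print(context)                          # ['D1', 'G1', 'D2', 'D3', 'G2', 'D4']
print(edge_distances([1, 4], total=6))  # [1, 1]
print(gaps_between([1, 4]))             # [2]
```

Sweeping `positions` over all valid combinations while holding the question fixed would give one way to probe how accuracy varies with both quantities, under the assumptions stated above.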

Code & Models

Repositories

spongeorge/long-context-multihop
PyTorch · Official
