Stay Focused: Problem Drift in Multi-Agent Debate

Jonas Becker, Lars Benedikt Kaesberg, Andreas Stephan, Jan Philip Wahle, Terry Ruas, and Bela Gipp

arXiv:2502.19559·cs.CL·April 10, 2026

TL;DR

This paper investigates problem drift in multi-agent debate with large language models: it analyzes the causes and prevalence of drift and proposes methods to detect and mitigate it, with the aim of improving task performance.

Contribution

It introduces the concept of problem drift, quantifies its occurrence across tasks, and proposes DRIFTJudge and DRIFTPolicy to detect and reduce drift in multi-agent debates.

Findings

01

Problem drift is frequent on generative tasks (76-89%), which the authors attribute to the subjectivity of the answer space, compared with 7-21% on high-complexity tasks.

02

Eight human experts analyzed 170 multi-agent debates affected by problem drift to identify its causes, including lack of progress and low-quality feedback.

03

DRIFTPolicy mitigates 31% of problem drift cases.

Abstract

Multi-agent debate - multiple instances of large language models discussing problems in turn-based interaction - has shown promise for solving knowledge and reasoning tasks. However, these methods show limitations when solving complex problems that require longer reasoning chains. We analyze how multi-agent debate drifts away from the initial problem over multiple turns, thus harming task performance. We define this phenomenon as problem drift and quantify its presence across ten tasks (i.e., three generative, three knowledge, three reasoning, and one instruction-following task). We find that generative tasks drift often due to the subjectivity of the answer space (76-89%), compared to high-complexity tasks (7-21%). To identify the reasons, eight human experts analyze 170 multi-agent debates suffering from problem drift. We find the most common issues related to this drift are the lack…
