EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions

Smit Nautambhai Modi; Gandharv Mahajan; Marc Wetter; Randall Welles

arXiv:2604.16456 · cs.CL · April 21, 2026

TL;DR

EchoChain is a benchmark that evaluates how well voice assistants update their task state when users interrupt them mid-speech. No evaluated model exceeds a 50% pass rate, leaving substantial room for improvement in real-time state reasoning.

Contribution

The paper introduces a controlled, scenario-driven benchmark for assessing full-duplex state-update reasoning under interruptions, and identifies three recurring failure patterns: contextual inertia, interruption amnesia, and objective displacement.

Findings

01. In a paired half-duplex control, total failures drop by 40.2% relative to interrupted runs.

02. No evaluated model exceeds a 50% pass rate.

03. Many errors stem from state-update reasoning under interruption rather than from task difficulty alone.

Abstract

Real-time voice assistants must revise task state when users interrupt mid-response, but existing spoken-dialog benchmarks largely evaluate turn-based interaction and miss this failure mode. We introduce EchoChain, a controlled benchmark for evaluating full-duplex state-update reasoning under mid-speech interruptions. EchoChain identifies three recurring failure patterns in post-interruption continuations: contextual inertia, interruption amnesia, and objective displacement. The benchmark generates scenario-driven conversations and injects interruptions at a standardized point relative to assistant speech onset, enabling controlled cross-model comparison. In a paired half-duplex control, total failures drop by 40.2% relative to interrupted runs, indicating that many errors are driven by state-update reasoning under interruption rather than task difficulty alone. Across evaluated…
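
The 40.2% figure in the abstract is a relative reduction: the drop in total failures from the interrupted runs to the paired half-duplex control, divided by the failures in the interrupted runs. The sketch below shows that arithmetic under stated assumptions. The `PairedRun` record, its field names, and the sample data are hypothetical and are not taken from the paper or its released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedRun:
    """Outcome of one scenario in both conditions (hypothetical schema)."""
    scenario_id: str
    failed_interrupted: bool  # full-duplex run with a mid-speech interruption
    failed_half_duplex: bool  # paired turn-based control for the same scenario


def relative_failure_reduction(runs: list[PairedRun]) -> float:
    """Fraction by which total failures drop in the control,
    relative to the total failures in the interrupted runs."""
    interrupted = sum(r.failed_interrupted for r in runs)
    control = sum(r.failed_half_duplex for r in runs)
    if interrupted == 0:
        raise ValueError("no failures in interrupted runs; reduction is undefined")
    return (interrupted - control) / interrupted


# Illustrative data only, not results from the paper.
runs = [
    PairedRun("s1", True, True),
    PairedRun("s2", True, False),
    PairedRun("s3", True, True),
    PairedRun("s4", True, True),
    PairedRun("s5", False, False),
]

reduction = relative_failure_reduction(runs)
print(f"{reduction:.1%}")  # 4 interrupted failures vs. 3 control failures
```

Pairing each interrupted run with a control run on the same scenario holds task difficulty fixed, so failures that disappear in the control can be attributed to the interruption rather than to the task itself.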
