Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue

Kratika Bhagtani; Mrinal Anand; Yu Chen Xu; Amit Kumar Singh Yadav

arXiv:2603.11409·cs.AI·March 13, 2026

Open Access · 10 Models · 1 Dataset

TL;DR

This paper addresses turn-taking in multi-party voice conversations. It proposes a context-aware method that decides, at each detected pause, whether an AI assistant should speak or stay silent, and it trains models explicitly for that decision.

Contribution

The paper introduces a new benchmark dataset and a supervised fine-tuning approach with reasoning traces to enable context-aware turn-taking in multi-party dialogue systems.

Findings

01

Eight recent large language models consistently fail at context-aware turn-taking in multi-party settings under zero-shot prompting.

02

Supervised fine-tuning with reasoning traces improves balanced accuracy by up to 23 percentage points.

03

Context-aware turn-taking is not an emergent capability; it requires explicit training.

Abstract

Existing voice AI assistants treat every detected pause as an invitation to speak. This works in dyadic dialogue, but in multi-party settings, where an AI assistant participates alongside multiple speakers, pauses are abundant and ambiguous. An assistant that speaks on every pause becomes disruptive rather than useful. In this work, we formulate context-aware turn-taking: at every detected pause, given the full conversation context, our method decides whether the assistant should speak or stay silent. We introduce a benchmark of over 120K labeled conversations spanning three multi-party corpora. Evaluating eight recent large language models, we find that they consistently fail at context-aware turn-taking under zero-shot prompting. We then propose a supervised fine-tuning approach with reasoning traces, improving balanced accuracy by up to 23 percentage points. Our findings suggest that…
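The abstract reports results in balanced accuracy, which is the mean of per-class recall. The sketch below illustrates why that metric suits a speak/stay-silent decision where pauses that call for silence may be far more common than pauses that call for a response. It is a minimal illustration, not the paper's evaluation code: the `PauseExample` structure, the label strings, the `always_speak` baseline, and the toy conversation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical label names; the paper's actual label encoding is not given here.
SPEAK = "speak"
SILENT = "silent"


@dataclass
class PauseExample:
    """One detected pause: the conversation so far plus the gold decision."""
    context: List[str]  # utterances preceding the pause (assumed format)
    label: str          # SPEAK or SILENT


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean recall over the classes that appear in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    recalls = []
    for cls in (SPEAK, SILENT):
        indices = [i for i, y in enumerate(y_true) if y == cls]
        if not indices:
            continue  # class absent from the gold labels; skip it
        correct = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(correct / len(indices))
    if not recalls:
        raise ValueError("y_true contains no known labels")
    return sum(recalls) / len(recalls)


def always_speak(context: List[str]) -> str:
    """Baseline that treats every pause as an invitation to speak."""
    return SPEAK


# Toy data (invented for illustration): one pause addressed to the assistant,
# three pauses where the human speakers are talking to each other.
examples = [
    PauseExample(["A: Assistant, what time is the meeting?"], SPEAK),
    PauseExample(["A: Did you finish the report?"], SILENT),
    PauseExample(["A: Did you finish the report?", "B: Almost."], SILENT),
    PauseExample(["B: Let me check my notes."], SILENT),
]

gold = [ex.label for ex in examples]
pred = [always_speak(ex.context) for ex in examples]
print(balanced_accuracy(gold, pred))  # 0.5
```

On this toy set the always-speak baseline gets perfect recall on the speak class and zero recall on the silent class, so its balanced accuracy is 0.5 regardless of how skewed the labels are. Plain accuracy on the same predictions would be 0.25 and would shift with the class ratio, which makes it harder to compare across corpora.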

Code & Models

Datasets

ishiki-labs/multi-party-dialogue
dataset · 79 downloads

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · AI in Service Interactions