Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents
Bandhav Veluri, Benjamin N Peloquin, Bokai Yu, Hongyu Gong, Shyamnath Gollakota

TL;DR
This paper introduces Synchronous LLMs that incorporate real-world time into their architecture, enabling full-duplex spoken dialogue with natural turn-taking, overlapping speech, and backchanneling, surpassing traditional turn-based models.
Contribution
The authors propose a novel mechanism to embed time information into Llama3-8b, enabling synchronous operation and full-duplex dialogue modeling, along with a training recipe using synthetic data to enhance naturalness.
Findings
Outperform state-of-the-art in dialogue meaningfulness
Maintain naturalness in spoken dialogue
Demonstrate full-duplex interaction between agents with latency considerations
Abstract
Despite broad interest in modeling spoken dialogue agents, most approaches are inherently "half-duplex" -- restricted to turn-based interaction with responses requiring explicit prompting by the user or implicit tracking of interruption or silence events. Human dialogue, by contrast, is "full-duplex" allowing for rich synchronicity in the form of quick and dynamic turn-taking, overlapping speech, and backchanneling. Technically, the challenge of achieving full-duplex dialogue with LLMs lies in modeling synchrony as pre-trained LLMs do not have a sense of "time". To bridge this gap, we propose Synchronous LLMs for full-duplex spoken dialogue modeling. We design a novel mechanism to integrate time information into Llama3-8b so that they run synchronously with the real-world clock. We also introduce a training recipe that uses 212k hours of synthetic spoken dialogue data generated from…
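To make the half-duplex versus full-duplex distinction concrete, here is a minimal sketch that treats each speaker as a stream of fixed-duration time chunks. It is an illustration of the terminology only, not the paper's mechanism: the chunk length `CHUNK_MS`, the `<sil>` marker, the `A:`/`B:` tags, and all function names are hypothetical.

```python
from typing import List, Optional

CHUNK_MS = 160  # hypothetical chunk duration; not taken from the paper

# One entry per time chunk; None means the speaker is silent in that chunk.
Stream = List[Optional[str]]


def overlap_chunks(a: Stream, b: Stream) -> int:
    """Count time chunks in which both speakers are active."""
    return sum(1 for x, y in zip(a, b) if x is not None and y is not None)


def is_half_duplex(a: Stream, b: Stream) -> bool:
    """A strictly turn-based exchange has no overlapping chunks."""
    return overlap_chunks(a, b) == 0


def interleave(a: Stream, b: Stream) -> List[str]:
    """Merge two time-aligned streams into one sequence, chunk by chunk."""
    merged: List[str] = []
    for x, y in zip(a, b):
        merged.append(f"A:{x if x is not None else '<sil>'}")
        merged.append(f"B:{y if y is not None else '<sil>'}")
    return merged


# Toy dialogue: the agent backchannels ("mhm") while the user is still talking.
user: Stream = ["so", "I", "was", "thinking", None, None]
agent: Stream = [None, None, "mhm", None, "go", "on"]

print(overlap_chunks(user, agent))   # chunks where both speak
print(is_half_duplex(user, agent))   # False when any overlap exists
print(len(user) * CHUNK_MS, "ms of dialogue")
print(interleave(user, agent))
```

Because every position in the merged sequence corresponds to a fixed slice of wall-clock time, silence has to be represented explicitly, which is one way a sequence model could be kept in step with a clock under these assumptions.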
Taxonomy
Topics: Speech and dialogue systems · Service-Oriented Architecture and Web Services · Multi-Agent Systems and Negotiation
