Dual-Axis Generative Reward Model Toward Semantic and Turn-taking Robustness in Interactive Spoken Dialogue Models

Yifu Chen; Shengpeng Ji; Zhengqing Liu; Qian Chen; Wen Wang; Ziqing Wang; Yangzhuo Li; Tianle Liang; Zhou Zhao

arXiv:2604.14920 · cs.AI · April 17, 2026

TL;DR

This paper introduces a Dual-Axis Generative Reward Model for interactive spoken dialogue models. The model gives reliable, detailed feedback on two axes, semantic quality and turn-taking timing, so that it can serve as a reward signal for reinforcement learning aimed at improving interaction quality.

Contribution

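The paper presents a reward model that evaluates semantic quality and turn-taking quality separately. This addresses the limitations of existing automated metrics, which rely on superficial proxies such as behavioral statistics or timing-prediction accuracy, and enables better RL training for spoken dialogue models (SDMs).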

Findings

01  Achieved state-of-the-art performance on interaction-quality assessment.

02  Provided reliable, detailed diagnostic feedback for SDMs.

03  Demonstrated effectiveness across synthetic and real-world datasets.

Abstract

Achieving seamless, human-like interaction remains a key challenge for full-duplex spoken dialogue models (SDMs). Reinforcement learning (RL) has substantially enhanced text- and vision-language models, while well-designed reward signals are crucial for the performance of RL. We consider RL a promising strategy to address the key challenge for SDMs. However, a fundamental barrier persists: prevailing automated metrics for assessing interaction quality rely on superficial proxies, such as behavioral statistics or timing-prediction accuracy, failing to provide reliable reward signals for RL. On the other hand, human evaluations, despite their richness, remain costly, inconsistent, and difficult to scale. We tackle this critical barrier by proposing a Dual-Axis Generative Reward Model, which is trained to understand complex interaction dynamics using a detailed taxonomy and an annotated…
