OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

Haonan Zhang; Run Luo; Xiong Liu; Yuchuan Wu; Ting-En Lin; Pengpeng Zeng; Qiang Qu; Feiteng Fang; Min Yang; Lianli Gao; Jingkuan Song; Fei Huang; Yongbin Li

arXiv:2505.20277 · cs.CL · June 6, 2025

TL;DR

OmniCharacter is a speech-language personality interaction model for immersive role-playing agents. It keeps a character's vocal and dialogue traits consistent at low latency, is supported by a new dataset (OmniCharacter-10K), and outperforms existing models on response content and style.

Contribution

The paper presents OmniCharacter, the first seamless speech-language personality interaction model for RPAs, along with a new dataset OmniCharacter-10K, enhancing realism and immersion in AI role-playing scenarios.

Findings

01 Produces better response content and style than existing models

02 Achieves a low response latency of 289 ms

03 Supports diverse characters with rich contextual dialogues

Abstract

Role-Playing Agents (RPAs), which build on large language models, are an emerging class of interactive AI systems that simulate roles or characters with diverse personalities. However, existing methods focus primarily on mimicking dialogue among roles in textual form and neglect the role's voice traits (e.g., voice style and emotions), which play a crucial part in interaction and make for a more immersive experience in realistic scenarios. Toward this goal, we propose OmniCharacter, the first seamless speech-language personality interaction model to achieve immersive RPAs with low latency. Specifically, OmniCharacter enables agents to consistently exhibit role-specific personality traits and vocal traits throughout the interaction, and supports a mixture of speech and language responses. To align the model with speech-language scenarios, we construct a dataset named OmniCharacter-10K, which…

Code & Models

Repositories

alibabaresearch/damo-convai (PyTorch, official)

Models

Tongyi-ConvAI/OmniCharacter-7B (Hugging Face model)

Datasets

Tongyi-ConvAI/OmniCharacter (dataset, 204 downloads)

Videos

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

Taxonomy

Topics: Speech and dialogue systems · Social Robot Interaction and HRI

Methods: Focus · ALIGN