MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
Yanqi Dai, Huanran Hu, Lei Wang, Shengjie Jin, Xu Chen, Zhiwu Lu

TL;DR
This paper introduces MMRole, a comprehensive framework for developing and evaluating multimodal role-playing agents. It comprises a new dataset, an evaluation approach, and a specialized agent, and targets the use of role-playing agents for delivering emotional value and supporting sociological research.
Contribution
The paper presents MMRole, described as the first complete framework for developing and evaluating multimodal role-playing agents. It consists of a new dataset (MMRole-Data), an evaluation approach (MMRole-Eval), and a specialized agent model (MMRole-Agent).
Findings
MMRole-Agent outperforms baseline models in evaluations.
The dataset, MMRole-Data, includes 85 characters, 11K images, and 14K single or multi-turn dialogues.
Challenges include improving multimodal understanding and role consistency.
Abstract
Recently, Role-Playing Agents (RPAs) have garnered increasing attention for their potential to deliver emotional value and facilitate sociological research. However, existing studies are primarily confined to the textual modality, unable to simulate humans' multimodal perceptual capabilities. To bridge this gap, we introduce the concept of Multimodal Role-Playing Agents (MRPAs), and propose a comprehensive framework, MMRole, for their development and evaluation, which comprises a personalized multimodal dataset and a robust evaluation approach. Specifically, we construct a large-scale, high-quality dataset, MMRole-Data, consisting of 85 characters, 11K images, and 14K single or multi-turn dialogues. Additionally, we present a robust evaluation approach, MMRole-Eval, encompassing eight metrics across three dimensions, where a reward model is designed to score MRPAs with the constructed…
Taxonomy
Topics: Speech and dialogue systems · Natural Language Processing Techniques · AI in Service Interactions
Methods: Softmax · Attention Is All You Need
