MOA: Multi-Objective Alignment for Role-Playing Agents
Chonghua Liao, Ke Wang, Yuchuan Wu, Ruoran Li, Fei Huang, Yongbin Li

TL;DR
This paper introduces MOA, a multi-objective reinforcement learning framework that enhances role-playing agents by optimizing multiple objectives simultaneously, leading to more capable and balanced agents.
Contribution
MOA is a novel multi-objective optimization strategy that trains RPAs on multiple rubrics and employs thought-augmented rollouts for improved diversity and quality.
Findings
MOA improves multi-dimensional role-playing performance over baselines.
An 8B model trained with MOA achieves performance comparable to strong closed-source models.
MOA effectively balances multiple objectives in role-playing agents.
Abstract
Role-playing agents (RPAs) require balancing multiple objectives, such as instruction following, persona consistency, and stylistic fidelity, which are not always perfectly aligned across different dimensions. While prior work has primarily relied on supervised fine-tuning or reinforcement learning with scalarized rewards, these approaches do not explicitly address the coordination of multiple reward dimensions during optimization. We present MOA (Multi-Objective Alignment), a reinforcement-learning framework that enables multi-dimensional, fine-grained rubric optimization for general RPAs. MOA introduces a novel multi-objective optimization strategy that trains simultaneously on multiple fine-grained rubrics to boost optimization performance. Additionally, to improve both output diversity and generation quality, we employ thought-augmented rollouts…
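To make the contrast in the abstract concrete, here is a minimal sketch of what a "scalarized reward" loses compared with keeping one signal per rubric. It is an illustration only and not MOA's optimization strategy, which the abstract does not spell out. The rubric names come from the abstract; the scores, the equal weights, and the helper names `scalarize` and `per_rubric_advantages` are assumptions made up for this example.

```python
from statistics import mean, pstdev

# Rubric names taken from the abstract; everything else here is hypothetical.
RUBRICS = ["instruction_following", "persona_consistency", "stylistic_fidelity"]


def scalarize(scores, weights):
    """Collapse one rollout's per-rubric scores into a single weighted reward."""
    return sum(weights[r] * scores[r] for r in RUBRICS)


def per_rubric_advantages(group):
    """Standardize each rubric separately across a group of rollouts.

    Returns one dict per rollout, mapping rubric name to its z-score within
    the group (0.0 when every rollout has the same score on that rubric).
    """
    advantages = [{} for _ in group]
    for r in RUBRICS:
        values = [scores[r] for scores in group]
        mu, sigma = mean(values), pstdev(values)
        for i, v in enumerate(values):
            advantages[i][r] = 0.0 if sigma == 0 else (v - mu) / sigma
    return advantages


# Two made-up rollouts with opposite strengths.
rollout_a = {"instruction_following": 1.0, "persona_consistency": 0.0, "stylistic_fidelity": 0.5}
rollout_b = {"instruction_following": 0.0, "persona_consistency": 1.0, "stylistic_fidelity": 0.5}
weights = {r: 1.0 for r in RUBRICS}  # assumed equal weighting

reward_a = scalarize(rollout_a, weights)
reward_b = scalarize(rollout_b, weights)
advantages = per_rubric_advantages([rollout_a, rollout_b])
```

With these made-up numbers both rollouts receive the same scalarized reward, so a single-scalar objective cannot tell them apart, while the per-rubric view still records that one rollout is stronger on instruction following and the other on persona consistency.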
