Dynamic Context Adaptation for Consistent Role-Playing Agents with Retrieval-Augmented Generations

Jeiyoon Park; Yongshin Han; Minseop Kim; Kisu Yang

arXiv:2508.02016 · cs.AI · February 6, 2026

TL;DR

This paper introduces Amadeus, a training-free framework that improves persona consistency in retrieval-augmented role-playing agents, and CharacterRAG, a new dataset for evaluating them.

Contribution

The paper presents a training-free method for improving persona consistency in RAG-based role-playing agents (RPAs) and a manually constructed dataset for evaluating them.

Findings

01 Amadeus significantly improves persona consistency, even for questions that lie beyond a character's knowledge.

02 The CharacterRAG dataset enables rigorous evaluation of RAG-based RPAs.

03 RAG-based RPAs can model a character's knowledge and personality attributes.

Abstract

Building role-playing agents (RPAs) that faithfully emulate specific characters remains challenging because collecting character-specific utterances and continually updating model parameters are resource-intensive, making retrieval-augmented generation (RAG) a practical necessity. However, despite the importance of RAG, there has been little research on RAG-based RPAs. For example, we empirically find that when a persona lacks knowledge relevant to a given query, RAG-based RPAs are prone to hallucination, making it challenging to generate accurate responses. In this paper, we propose Amadeus, a training-free framework that can significantly enhance persona consistency even when responding to questions that lie beyond a character's knowledge. In addition, to underpin the development and rigorous evaluation of RAG-based RPAs, we manually construct CharacterRAG, a role-playing dataset that…
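The failure mode described in the abstract (a query falls outside the persona's knowledge, yet the retriever still returns its nearest chunks) can be illustrated with a small sketch. This is not the Amadeus method; it is a generic relevance-threshold filter under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the persona chunks describe a made-up character, and `min_score=0.2` is an arbitrary illustrative cutoff.

```python
import math
import re
from collections import Counter

# Hypothetical persona chunks for a made-up character (not from CharacterRAG).
PERSONA_CHUNKS = [
    "She is a neuroscience researcher at a university lab.",
    "She likes to drink Dr Pepper.",
    "She distrusts unproven theories and values evidence.",
]

def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, chunks, top_k=2, min_score=0.2):
    """Return (kept, out_of_knowledge).

    kept holds up to top_k (score, chunk) pairs whose score is at least
    min_score. out_of_knowledge is True when nothing passes the cutoff,
    signalling that the agent should not answer from low-relevance chunks.
    """
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    kept = [(s, c) for s, c in scored[:top_k] if s >= min_score]
    return kept, len(kept) == 0

if __name__ == "__main__":
    for question in ["What do you like to drink?", "Who won the 2022 World Cup?"]:
        kept, out_of_knowledge = retrieve(question, PERSONA_CHUNKS)
        print(question, kept, out_of_knowledge)
```

Without the `min_score` check, the second question would still receive the two top-ranked chunks despite zero similarity, which is the kind of low-relevance context that can push an agent toward fabricated answers. How Amadeus itself handles such queries is described in the paper, not in this sketch.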

Peer Reviews

Decision · Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 4

Strengths

The paper explicitly targets a common failure mode in RAG-based role-playing: when a user asks about aspects that are not explicitly in the persona, vanilla retrievers overuse low-relevance chunks and the agent hallucinates. The abstract and introduction motivate this crisply and position AMADEUS as training-free with three modules. ACTS preserves hierarchical context with empirical support that maximizes summed similarity and minimizes variance; ACTS/ATS outperform standard splitters across em

Weaknesses

CharacterRAG contains only 15 fictional characters, and much of the persona content is mined from Namuwiki; it remains unclear how well the findings transfer to real people or evolving personas. Adding non-fictional or time-varying personas would strengthen the claims. While ACTS’s hierarchical extraction cost is noted (O(N)), the end-to-end latency and token/dollar costs (especially for GS/AE with large models) are not reported in detail across LLMs/datasets, limiting deployment guidance. The related w

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. AMADEUS (with ACTS, GS, AE modules) fixes RAG-based RPAs’ hallucinations and poor persona consistency in out-of-knowledge queries, outperforming traditional RAG by enhancing chunking, filtering, and attribute extraction.
2. The manually built CharacterRAG (15 characters, 976K chars, 450 QAs) removes interference (e.g., editor’s inferences) and fills the lack of dedicated RAG-based RPA evaluation resources.
3. Using 3 LLMs, 3 embedding models, 3 baselines, and covering in/out-of-knowled

Weaknesses

1. The CharacterRAG dataset includes 15 fictional characters, but the paper does not specify their genre (e.g., anime, novel, film) or personality span (e.g., introverted vs. extroverted, heroic vs. villainous). If characters are concentrated in a single genre or share similar traits, the framework’s generalization to diverse role-playing scenarios (e.g., classical novel characters) remains unvalidated.
2. The Attribute Extractor (AE) only extracts "Belief and Value" and "Psychological Traits

Reviewer 03 · Rating 2 · Confidence 4

Strengths

i. This paper's attempt to improve the retrieval accuracy of persona information for RAG-based role-playing agents is meaningful.
ii. The collection of a new dataset demonstrates the authors’ effort to empirically explore this problem and provides a potential resource for future studies.

Weaknesses

i. The paper is poorly written, and many essential details are missing, which makes it difficult to fully understand and reproduce the proposed approach.
- The description of the CharacterRAG dataset construction process lacks sufficient detail. It is unclear how the persona documents were collected and processed, how the 450 QA pairs were generated, and what standards were used to filter unqualified documents. Furthermore, the authors do not discuss any measures taken to ensure the fidelity an

Code & Models

Datasets

naruto-soop/CharacterRAG
dataset · 34 dl
