Codified Finite-state Machines for Role-playing

Letian Peng; Yupeng Hou; Kun Zhou; Jingbo Shang

arXiv:2602.05905·cs.CL·February 6, 2026


Open Access · 3 Reviews

TL;DR

This paper introduces Codified Finite-State Machines (CFSMs) and their probabilistic extension, CPFSMs. Both use LLMs to automatically generate interpretable state models from character profiles, improving consistency and variability in role-playing.

Contribution

The paper presents a novel framework for automatically creating finite-state machines from textual profiles using LLMs, enhancing role-playing with interpretable and probabilistic state modeling.
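To make the idea of a codified state machine concrete, here is a minimal sketch in Python. It is not the paper's implementation: the `MarioState` states, the event names, and the `transition` function are all hypothetical, written by hand as a stand-in for the kind of transition code an LLM might generate from a character profile. Mario is used only because the reviews mention Mario state transitions as one of the synthetic tasks.

```python
from enum import Enum


class MarioState(Enum):
    # Hypothetical state set; the paper's actual states may differ.
    SMALL = "small"
    SUPER = "super"
    FIRE = "fire"
    DEAD = "dead"


def transition(state: MarioState, event: str) -> MarioState:
    """Hand-written stand-in for LLM-generated transition code."""
    if state is MarioState.DEAD:
        return state  # terminal state: no event changes it
    if event == "mushroom":
        # A mushroom only upgrades small Mario.
        return MarioState.SUPER if state is MarioState.SMALL else state
    if event == "fire_flower":
        return MarioState.FIRE
    if event == "hit":
        # Powered-up forms shrink; small Mario dies.
        return MarioState.DEAD if state is MarioState.SMALL else MarioState.SMALL
    return state  # unknown events leave the state unchanged


def run(events, start=MarioState.SMALL):
    """Fold a sequence of events into the trace of visited states."""
    trace = [start]
    for event in events:
        trace.append(transition(trace[-1], event))
    return trace


if __name__ == "__main__":
    for s in run(["mushroom", "fire_flower", "hit", "hit"]):
        print(s.value)
```

Because the transitions are ordinary code, they can be read, tested, and corrected directly, which is the sense in which such a structure is interpretable.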

Findings

01

CFSMs outperform baseline methods in structured role-playing tasks.

02

CPFSMs effectively model uncertainty and variability in open-ended scenarios.

03

Both models demonstrate superior performance in synthetic and real-world evaluations.

Abstract

Modeling latent character states is crucial for consistent and engaging role-playing (RP) with large language models (LLMs). Yet, existing prompting-based approaches mainly capture surface actions, often failing to track the latent states that drive interaction. We revisit finite-state machines (FSMs), long used in game design to model state transitions. While effective in small, well-specified state spaces, traditional hand-crafted, rule-based FSMs struggle to adapt to the open-ended semantic space of RP. To address this, we introduce Codified Finite-State Machines (CFSMs), a framework that automatically codifies textual character profiles into FSMs using LLM-based coding. CFSMs extract key states and transitions directly from the profile, producing interpretable structures that enforce character consistency. To further capture uncertainty and variability, we extend CFSMs into Codified…
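The abstract is cut off before it describes the probabilistic extension, so the following is only a generic illustration of how uncertainty can be added to a state machine, not the paper's method. It assumes a belief (a probability distribution over states) that is pushed through an event-specific transition table. The states, events, and probabilities are invented for the example.

```python
# Hypothetical latent states for a character.
STATES = ["calm", "wary", "hostile"]

# Hypothetical transition tables: TRANSITIONS[event][i][j] is the probability
# of moving from STATES[i] to STATES[j] when the event occurs.
# Each row sums to 1.
TRANSITIONS = {
    "insult": [
        [0.25, 0.5, 0.25],
        [0.0, 0.5, 0.5],
        [0.0, 0.0, 1.0],
    ],
    "apology": [
        [1.0, 0.0, 0.0],
        [0.75, 0.25, 0.0],
        [0.25, 0.5, 0.25],
    ],
}


def step(belief, event):
    """Return the updated distribution over STATES after one event."""
    matrix = TRANSITIONS[event]
    n = len(STATES)
    return [sum(belief[i] * matrix[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    belief = [1.0, 0.0, 0.0]  # start certain that the character is calm
    for event in ["insult", "apology"]:
        belief = step(belief, event)
        print(event, dict(zip(STATES, belief)))
```

Keeping a distribution rather than a single state lets a role-playing system sample different but plausible reactions to the same event while still being constrained by the profile.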

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The codification of character logic via FSMs, driven by LLMs, presents a novel mechanism to preserve behavioral coherence in long-form role-playing.
- Experimental results show a clear improvement in behavioral consistency after introducing CFSM. Whether in synthetic tasks (e.g., Mario state transitions) or real narrative scenarios, characters’ state transitions become more coherent and believable. CFSM and CPFSM effectively reduce the confusion and inconsistency commonly observed in prompt-b

Weaknesses

The proposed framework heavily depends on the LLM to extract states and generate transition rules. If the LLM-produced code contains errors or omissions, it may compromise the correctness of the resulting finite-state machine. The paper provides limited discussion on how to validate or correct the logic generated by the LLM, leaving the reliability of the approach partially contingent on the quality of the LLM’s rule extraction process. Another concern lies in the current evaluation, which prim

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1) The described methods work on the various artifacts mentioned in the results and demonstrate strong performance against the baselines.
2) The paper reports the computational complexity of both methods and shows that the proposed methods codify faster and more efficiently.
3) The paper includes a very detailed analysis section covering synthetic and real-plot experiments; it is tested with multiple LLMs and techniques, and uses various kinds of plots and scenes from various ge

Weaknesses

1) The “preliminary and denotation” section introduces the necessary terminology but lacks examples and a lucid explanation, which would help readers and a general audience unfamiliar with such methods.
2) The treatment of multi-modality and reactions in CPFSM lacks depth and could be explained more clearly.
3) The real-plot experiments could briefly explain one of the artifacts used in the work as a running example. Without one, the paper is less intuitive for new readers.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. Interpretability: The framework brings interpretability to state modeling in RP with executable, codified transitions derived directly from character profiles.
2. Probabilistic Extension: The CPFSM mechanism elegantly integrates stochasticity into state transitions, explicitly modeling uncertainty in RP.
3. Efficiency: CFSM delivers both accuracy and efficiency, as highlighted in Table 5.

Weaknesses

1. Evaluation Scope (Generality): Empirical testing relies primarily on the Fandom Benchmark and three synthetic state machines. The real-world scenarios are derived from highly narrativized, structured data (Fandom plots) with limited diversity of state-space complexity and ambiguity. GPT-4.1 is both judge and model in several settings, and open-ended role-play evaluations rely heavily on LLM judgment. There is insufficient third-party or human evaluation of RP quality, which may limit claims o


Taxonomy

Topics: Artificial Intelligence in Games · Multimodal Machine Learning Applications · Human Motion and Animation