BaZi-Based Character Simulation Benchmark: Evaluating AI on Temporal and Persona Reasoning

Siyuan Zheng; Pai Liu; Xi Chen; Jizheng Dong; Sihan Jia

arXiv:2510.23337·cs.CL·October 28, 2025

TL;DR

This paper introduces a benchmark and system for culturally grounded virtual character simulation using BaZi-based reasoning. The system reports a 30.3%-62.6% accuracy improvement over mainstream LLMs such as DeepSeek-v3 and GPT-5-mini.

Contribution

It presents the first QA dataset for BaZi-based persona reasoning and a BaZi-LLM system that combines symbolic reasoning with LLMs for dynamic virtual characters.

Findings

01 30.3%-62.6% accuracy improvement over mainstream LLMs

02 Accuracy drops 20%-45% with incorrect BaZi information

03 First culturally grounded persona reasoning benchmark

Abstract

Human-like virtual characters are crucial for games, storytelling, and virtual reality, yet current methods rely heavily on annotated data or handcrafted persona prompts, making it difficult to scale up and generate realistic, contextually coherent personas. We create the first QA dataset for BaZi-based persona reasoning, where real human experiences categorized into wealth, health, kinship, career, and relationships are represented as life-event questions and answers. Furthermore, we propose the first BaZi-LLM system that integrates symbolic reasoning with large language models to generate temporally dynamic and fine-grained virtual personas. Compared with mainstream LLMs such as DeepSeek-v3 and GPT-5-mini, our method achieves a 30.3%-62.6% accuracy improvement. In addition, when incorrect BaZi information is used, our model's accuracy drops by 20%-45%, showing the potential of…
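
The evaluation described in the abstract (life-event questions grouped into five categories and scored by accuracy, with and without correct BaZi information) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `QAItem` record, its field names, the yes/no answer labels, and the toy predictions are hypothetical and are not taken from the paper's dataset or code. Only the five category names come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass

# Category names as listed in the abstract.
CATEGORIES = ("wealth", "health", "kinship", "career", "relationships")


@dataclass(frozen=True)
class QAItem:
    """Hypothetical record for one life-event question (not the paper's schema)."""
    category: str   # one of CATEGORIES
    question: str   # life-event question text
    answer: str     # gold answer label


def overall_accuracy(items, predictions):
    """Fraction of predictions that match the gold answers."""
    if not items:
        return 0.0
    hits = sum(pred == item.answer for item, pred in zip(items, predictions))
    return hits / len(items)


def accuracy_by_category(items, predictions):
    """Accuracy computed separately for each category that appears in items."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        total[item.category] += 1
        correct[item.category] += int(pred == item.answer)
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data, invented for this sketch only.
items = [
    QAItem("career", "Did the person change jobs in the given year?", "yes"),
    QAItem("career", "Was the person promoted in the given year?", "no"),
    QAItem("health", "Did the person have surgery in the given year?", "yes"),
    QAItem("wealth", "Did the person buy property in the given year?", "no"),
]

# Hypothetical model outputs when given the correct vs. an incorrect BaZi chart.
preds_correct_chart = ["yes", "no", "no", "no"]
preds_wrong_chart = ["no", "no", "no", "yes"]

acc_correct = overall_accuracy(items, preds_correct_chart)
acc_wrong = overall_accuracy(items, preds_wrong_chart)
accuracy_drop = acc_correct - acc_wrong  # absolute drop on the toy data

print(accuracy_by_category(items, preds_correct_chart))
print(f"correct chart: {acc_correct:.2f}, wrong chart: {acc_wrong:.2f}, "
      f"drop: {accuracy_drop:.2f}")
```

The sketch computes the drop as an absolute difference in accuracy. The abstract does not state whether the reported percentages are absolute or relative, so that choice is also an assumption.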
