HumanLM: Simulating Users with State Alignment Beats Response Imitation

Shirley Wu; Evelyn Choi; Arpandeep Khatua; Zhanghan Wang; Joy He-Yueya; Tharindu Cyril Weerasooriya; Wei Wei; Diyi Yang; Jure Leskovec; James Zou

arXiv:2603.03303 · cs.CL · March 5, 2026

TL;DR

HumanLM introduces a novel user simulation framework that models underlying user states with reinforcement learning, significantly improving the realism and alignment of simulated responses compared to traditional imitation methods.

Contribution

The paper presents HumanLM, a new training framework that generates latent user states aligned with responses, enhancing user simulation accuracy beyond surface-level imitation.

Findings

01. HumanLM outperforms alternatives, with a 16.3% average improvement in alignment scores.

02. HumanLM achieves higher similarity to real user responses in real-time studies.

03. The Humanual benchmark provides extensive data for evaluating user simulators.

Abstract

Large Language Models (LLMs) are increasingly used to simulate how specific users respond to a given context, enabling more user-centric applications that rely on user feedback. However, existing user simulators mostly imitate surface-level patterns and language styles, which fail to reflect the underlying states of real users (e.g., beliefs and emotions). To address these limitations, we propose a novel training framework, HumanLM, which builds user simulators that accurately reflect real users. Our key insight is that, in addition to generating responses, the model should generate natural-language latent states that align with ground-truth responses through reinforcement learning. These latent states correspond to a set of psychologically grounded state dimensions that drive how real users respond. HumanLM further synthesizes these aligned latent states into responses that accurately…
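The idea of rewarding latent states that align with ground-truth responses can be illustrated with a toy scorer. The sketch below is not the paper's implementation: the names `Rollout`, `token_overlap`, and `alignment_reward` are hypothetical, the two state dimensions are taken only from the abstract's examples (beliefs and emotions), and a simple token-overlap measure stands in for whatever alignment score the authors actually use. The reinforcement learning loop that would consume this reward is omitted.

```python
from dataclasses import dataclass

# Illustrative dimensions only; the abstract names beliefs and emotions as examples.
STATE_DIMENSIONS = ("belief", "emotion")


@dataclass
class Rollout:
    """One simulated output: a natural-language state per dimension plus a response."""
    latent_states: dict[str, str]
    response: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (a crude stand-in for a learned or LLM-based judge)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def alignment_reward(rollout: Rollout, ground_truth_response: str) -> float:
    """Average, over state dimensions, of how well each generated state matches the real response."""
    scores = [
        token_overlap(rollout.latent_states.get(dim, ""), ground_truth_response)
        for dim in STATE_DIMENSIONS
    ]
    return sum(scores) / len(scores)
```

In a policy-gradient setup, a scalar like this would be computed per sampled rollout and used to upweight rollouts whose stated beliefs and emotions are consistent with what the real user wrote. A realistic scorer would need to judge semantic agreement rather than shared tokens.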

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Multimodal Machine Learning Applications