Reinforcement Learning Measurement Model

Wenqian Xu; Feng Ji

arXiv:2605.09305·stat.ME·May 12, 2026

TL;DR

The paper introduces the Reinforcement Learning Measurement Model (RLMM), a scalable framework for analyzing sequential process data in assessments, improving estimation efficiency and interpretability over existing models.

Contribution

It proposes a novel RLMM that decouples person-level choice sensitivity from task-level value representation, enabling efficient analysis of larger, more realistic process data.

Findings

1. RLMM achieved higher accuracy than MDP-MM in simulations.

2. RLMM had substantially lower runtime as task complexity increased.

3. Estimated person parameters correlated positively with performance metrics.

Abstract

Interactive assessments generate sequential process data that are not well handled by conventional item response models. Existing MDP-based measurement approaches, such as the Markov decision process measurement model (MDP-MM; LaMar, 2018), link action choices to state-action values, but their reliance on person-specific tabular value functions makes them difficult to scale beyond small, fully enumerated tasks. We propose the Reinforcement Learning Measurement Model (RLMM), a measurement framework that decouples person-level choice sensitivity from task-level value representation through a shared parametric action-value function, making estimation more computationally efficient for larger process-data settings. The model combines a Boltzmann choice rule with normalized advantages, a soft Bellman consistency penalty, and a block-coordinate MAP procedure for joint estimation, while also…
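To make the decoupling idea concrete, here is a minimal Python sketch of a Boltzmann choice rule applied to normalized advantages. It is an illustration under stated assumptions, not the authors' implementation: the function name `boltzmann_choice_probs`, the symbol `beta` for person-level choice sensitivity, and the range-based normalization are all hypothetical choices, since the abstract does not specify how advantages are normalized.

```python
import numpy as np

def boltzmann_choice_probs(q_values, beta):
    """Choice probabilities over the actions available in one state.

    q_values: shared (task-level) action values, one per available action.
    beta: person-level choice sensitivity (assumed non-negative here).
    """
    q = np.asarray(q_values, dtype=float)
    # Advantage: each action's value relative to the best action in this state.
    advantages = q - q.max()
    # ASSUMPTION: normalize by the value range so advantages lie in [-1, 0].
    # The paper's actual normalization may differ.
    spread = q.max() - q.min()
    if spread > 0:
        advantages = advantages / spread
    logits = beta * advantages
    # Subtract the max logit for numerical stability before exponentiating.
    logits = logits - logits.max()
    weights = np.exp(logits)
    return weights / weights.sum()

# Two hypothetical test takers facing the same shared action values.
q_shared = [1.0, 2.0, 4.0]
low = boltzmann_choice_probs(q_shared, beta=0.0)   # insensitive to value
high = boltzmann_choice_probs(q_shared, beta=5.0)  # strongly value-driven
```

In this sketch the action values are shared across people and only `beta` varies by person, which mirrors the separation of person-level sensitivity from task-level value representation described in the abstract. With `beta = 0` every action is equally likely, and larger `beta` concentrates probability on the highest-valued action.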
