Analyzing Process Data from Computer-Based Assessments: A Tutorial on Preprocessing, Feature Extraction, and Model-Based Inference
Daeun Hwangbo, Junyeong Park, Minjeong Jeon, Ick Hoon Jin

TL;DR
This paper presents a comprehensive framework for preprocessing, analyzing, and modeling process data from computer-based assessments, demonstrated through the PIAAC PS-TRE domain, with reproducible R code provided.
Contribution
It offers an end-to-end analytical pipeline including preprocessing, feature extraction, and model-based inference for process data, filling a gap in systematic guidance and cross-method consistency.
Findings
N-gram behavioral clusters reveal diagnostic differences among incorrect respondents.
Multidimensional scaling features effectively reconstruct behavioral variables.
Process-informed DIF analysis helps identify and reduce construct-irrelevant group differences.
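To make the n-gram finding concrete, the following is a minimal sketch of how contiguous action n-grams might be counted per respondent before any clustering. It is written in Python for illustration only (the paper's own code is in R), and the respondent IDs, action labels, and function names are hypothetical, not taken from the paper or from PIAAC log files.

```python
from collections import Counter


def action_ngrams(actions, n):
    """Return the contiguous n-grams of one respondent's action sequence."""
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]


def ngram_counts(sequences, n=2):
    """Map each respondent ID to a Counter of n-gram frequencies."""
    return {rid: Counter(action_ngrams(seq, n)) for rid, seq in sequences.items()}


# Hypothetical action sequences; real logs use task-specific action codes.
sequences = {
    "r1": ["START", "OPEN_EMAIL", "NEXT", "END"],
    "r2": ["START", "OPEN_EMAIL", "OPEN_EMAIL", "NEXT", "END"],
}

bigrams = ngram_counts(sequences, n=2)
for rid, counts in bigrams.items():
    print(rid, dict(counts))
```

The resulting per-respondent count vectors could then be passed to a clustering method of choice; which method the authors use is not stated in this summary.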
Abstract
Computer-based assessments routinely generate detailed interaction logs -- commonly referred to as process data -- that record every action a respondent performs during task completion, yet systematic preprocessing guidance, integrated analytical workflows, and cross-method consistency checks remain scarce in the literature. This paper provides a unified, end-to-end analytical framework for analyzing process data from large-scale assessments -- covering the full pipeline from raw log preprocessing to model-based inference -- using the Programme for the International Assessment of Adult Competencies (PIAAC) Problem Solving in Technology-Rich Environments (PS-TRE) domain as an illustrative example. We first present a systematic preprocessing pipeline -- including timestamp correction, duplicate removal, action block consolidation, and LLM-assisted standardization -- that transforms raw…
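As a rough illustration of three of the preprocessing steps named in the abstract, the sketch below orders events by timestamp, drops exact duplicates, and merges consecutive repeats of an action into a single block. These are assumptions about what each step could involve: sorting is only a stand-in for timestamp correction, the event layout and action labels are hypothetical, and the paper's actual R implementation may differ.

```python
def preprocess(events):
    """Clean one respondent's log, given as (timestamp_ms, action) tuples."""
    # Step 1 (assumed): restore chronological order as a stand-in for
    # timestamp correction.
    ordered = sorted(events, key=lambda e: e[0])

    # Step 2: remove exact duplicates (same timestamp and same action).
    seen = set()
    deduped = []
    for event in ordered:
        if event not in seen:
            seen.add(event)
            deduped.append(event)

    # Step 3: consolidate runs of the same action into one block.
    blocks = []
    for ts, action in deduped:
        if blocks and blocks[-1]["action"] == action:
            blocks[-1]["end"] = ts
            blocks[-1]["count"] += 1
        else:
            blocks.append({"action": action, "start": ts, "end": ts, "count": 1})
    return blocks


# Hypothetical raw log with one out-of-order event and one duplicate.
events = [
    (0, "START"),
    (1200, "TYPE"),
    (1200, "TYPE"),
    (1500, "TYPE"),
    (900, "CLICK"),
    (3000, "END"),
]

blocks = preprocess(events)
for block in blocks:
    print(block)
```

Keeping the start time, end time, and count of each block preserves timing information that a plain de-duplicated action sequence would lose.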
