Subtask Analysis of Process Data Through a Predictive Model
Zhi Wang, Xueying Tang, Jingchen Liu, Zhiliang Ying

TL;DR
This paper introduces a novel, computationally efficient method for analyzing complex process data from human-computer interactions by segmenting lengthy sequences into meaningful subtasks using predictability and entropy measures.
Contribution
It develops a new segmentation approach based on action predictability and Shannon entropy, enabling easier clustering and interpretation of process data.
Findings
Effective segmentation of process data into subtasks.
Improved clustering and interpretability of behavioral patterns.
Performance assessed through simulation studies and demonstrated on real process data from PIAAC 2012.
Abstract
Response process data collected from human-computer interactive items contain rich information about respondents' behavioral patterns and cognitive processes. Their irregular formats as well as their large sizes make standard statistical tools difficult to apply. This paper develops a computationally efficient method for exploratory analysis of such process data. The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction, easy clustering and meaningful interpretation. Each subprocess is considered a subtask. The segmentation is based on sequential action predictability using a parsimonious predictive model combined with the Shannon entropy. Simulation studies are conducted to assess performance of the new methods. We use the process data from PIAAC 2012 to demonstrate how exploratory analysis of process data can be done…
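The entropy-based segmentation idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the predictive model, so a smoothed first-order Markov (bigram) model stands in for the paper's parsimonious predictive model, and the cut rule (end a segment after any action whose next-action entropy exceeds a fixed `threshold`) is a hypothetical simplification rather than the authors' procedure. The names `fit_bigram`, `shannon_entropy`, and `segment` are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_bigram(sequences, alpha=1.0):
    """Fit a first-order Markov model with add-alpha smoothing.

    ASSUMPTION: a bigram model is a stand-in for the paper's
    predictive model, which is not described in this summary.
    Returns a function mapping the previous action to a
    probability distribution over the next action.
    """
    actions = sorted({a for seq in sequences for a in seq})
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    def predict(prev):
        total = sum(counts[prev].values()) + alpha * len(actions)
        return {a: (counts[prev][a] + alpha) / total for a in actions}

    return predict


def shannon_entropy(dist):
    """Shannon entropy (in nats) of a {outcome: probability} mapping."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def segment(seq, predict, threshold):
    """Split seq into subprocesses.

    ASSUMPTION: a segment ends after any action whose predicted
    next-action distribution has entropy above `threshold`, i.e.
    where the next action is hard to predict.
    """
    segments, current = [], []
    for action in seq:
        current.append(action)
        if shannon_entropy(predict(action)) > threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments
```

In this sketch, low entropy means the next action is largely determined by the current one, so the two are kept in the same subprocess; high entropy marks a point where the respondent's next move is uncertain, which is treated as a candidate subtask boundary. A typical call would be `predict = fit_bigram(all_sequences)` followed by `segment(one_sequence, predict, threshold)`, with `threshold` chosen by the analyst.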
Taxonomy
Topics: Sensory Analysis and Statistical Methods · Advanced Clustering Algorithms Research
