Dependencies in Item-Adaptive CAT Data and Differential Item Functioning Detection: A Multilevel Framework
Dandan Chen Kaptur, Justin Kern, Chingwei David Shin, Jinming Zhang

TL;DR
This paper introduces a multilevel logistic model to improve differential item functioning detection in computerized adaptive testing by accounting for dependencies induced by adaptive item selection.
Contribution
It proposes a two-level model that controls spurious DIF better than traditional single-level methods in CAT settings while maintaining competitive power.
Findings
Two-level model improves control of spurious DIF in CAT.
Model performs well with shorter tests and lower exposure rates.
Convergence varies systematically across conditions, affecting inference.
Abstract
Differential item functioning (DIF) detection is an important yet understudied problem in computerized adaptive testing (CAT). In this article, we proposed a two-level logistic model to improve DIF detection in CAT by explicitly accounting for nuisance effects arising from CAT-induced structural dependency. First, we conceptualized that adaptive item selection induces systematic dependencies among examinees and items through provisional ability estimates, whereas traditional single-level DIF methods assume independent observations and may yield misleading results in CAT settings. Then, using a numeric example and Monte Carlo simulations, we compared our proposed two-level model with competing single-level models under various CAT conditions, manipulating test length, exposure control, ability estimator, DIF type, and DIF prevalence. Item-level Type-I error and statistical power…
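For orientation, the kind of single-level baseline the abstract contrasts against can be written as the standard logistic regression DIF model. This is a sketch of that well-known formulation, not the paper's own specification. The symbols are assumptions introduced here: $Y_{ij}$ is examinee $i$'s scored response to item $j$, $\theta_i$ is a matching variable (for example, an ability estimate), and $G_i$ is a group indicator.

```latex
\[
\operatorname{logit} P(Y_{ij} = 1)
  = \beta_{0j} + \beta_{1j}\,\theta_i + \beta_{2j}\,G_i + \beta_{3j}\,\theta_i G_i
\]
% Uniform DIF for item j:     \beta_{2j} \neq 0 with \beta_{3j} = 0
% Nonuniform DIF for item j:  \beta_{3j} \neq 0
% The model treats responses as independent given \theta_i and G_i.
```

This independence assumption is the one the abstract says adaptive item selection can violate. The two-level model's exact form is not given in the text above, so it is not reproduced here.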
