Performance of Multi-group DIF Methods in Assessing Cross-Country Score Comparability of International Large-Scale Assessments
Dandan Chen (University of Illinois, Urbana-Champaign)

TL;DR
This study compares two multi-group differential item functioning (DIF) detection methods, the improved Wald test and generalized logistic regression, using TIMSS data, and finds the improved Wald test relatively more robust for assessing test fairness in international assessments.
Contribution
It provides a comparative analysis of two DIF detection methods, demonstrating the improved Wald test's relative reliability over generalized logistic regression in large-scale assessments.
Findings
The improved Wald test is more established for multi-group DIF analysis.
Both methods detect DIF affecting test fairness.
Empirical results inform method selection when analyzing international large-scale assessment (ILSA) scores.
Abstract
Standardized large-scale testing is a contested topic, and test fairness sits at its core. This study found that two of five recent multi-group DIF detection methods can capture both the uniform and the nonuniform DIF that affect test fairness: the improved Wald test and the generalized logistic regression procedure. No prior research, however, has demonstrated how these two methods perform relative to each other. This study assessed the commonalities and differences between the two sets of empirical results that these methods produced on the latest TIMSS math score data. The primary conclusion was that the improved Wald test is relatively more established than the generalized logistic regression procedure for multi-group DIF analysis. Empirical results from this study may inform the selection of a multi-group DIF…
Taxonomy
Topics: Psychometric Methodologies and Testing · Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference
