Data aggregation can lead to biased inferences in Bayesian linear mixed models and Bayesian ANOVA: A simulation study
Daniel J. Schad, Bruno Nicenboim, Shravan Vasishth

TL;DR
This simulation study shows that aggregating data to by-subject means before fitting Bayesian linear mixed models or Bayesian ANOVA can bias Bayes factors, especially when assumptions such as sphericity are violated, and recommends analyzing non-aggregated data with full random effects.
Contribution
The paper demonstrates how data aggregation biases Bayesian inferences and advocates for modeling individual trial data with full random effects to improve accuracy.
Findings
Aggregated data analysis can bias Bayes factors under sphericity violations.
Bayesian ANOVA on by-subject aggregated data can lead to biased conclusions when its assumptions, such as sphericity, are violated.
Modeling non-aggregated data with full random effects reduces bias.
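To make the sphericity point concrete, here is a minimal sketch in Python (not the authors' simulation code, which is not reproduced on this page). It simulates trial-level data for a three-level within-subject factor with unequal by-subject random slope standard deviations, aggregates to by-subject condition means, and compares the variances of the pairwise condition differences. All numeric values (sample sizes, means, standard deviations) are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design: 200 subjects, 3 conditions (A, B, C), 20 trials per cell.
n_subj, n_trials = 200, 20

# Hypothetical by-subject random slope SDs for the contrasts B-A and C-A.
# Unequal slope SDs are what produce the sphericity violation.
sd_slope = np.array([10.0, 60.0])
sd_resid = 50.0

# By-subject random intercepts and random slopes (no fixed effects of condition).
intercepts = rng.normal(400.0, 30.0, size=n_subj)
slopes = rng.normal(0.0, sd_slope, size=(n_subj, 2))

# True subject-by-condition means, shape (n_subj, 3).
mu = np.column_stack([
    intercepts,                 # condition A
    intercepts + slopes[:, 0],  # condition B
    intercepts + slopes[:, 1],  # condition C
])

# Trial-level (non-aggregated) data, shape (n_subj, 3, n_trials).
trials = mu[:, :, None] + rng.normal(0.0, sd_resid, size=(n_subj, 3, n_trials))

# Aggregation step: one mean per subject and condition.
agg = trials.mean(axis=2)

# Sphericity requires equal variances for all pairwise condition differences.
labels = "ABC"
diff_var = {}
for i, j in [(0, 1), (0, 2), (1, 2)]:
    diff_var[(i, j)] = np.var(agg[:, j] - agg[:, i], ddof=1)
    print(f"Var({labels[j]} - {labels[i]}) = {diff_var[(i, j)]:.0f}")
```

Under these assumed settings the variance of the C - A differences is much larger than that of the B - A differences, so the aggregated data violate sphericity. A model fitted to the aggregated means that assumes a single common variance for the differences cannot represent this, whereas a model of the trial-level data with full random effects can estimate each slope variance separately.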
Abstract
Bayesian linear mixed-effects models and Bayesian ANOVA are increasingly being used in the cognitive sciences to perform null hypothesis tests, where a null hypothesis that an effect is zero is compared with an alternative hypothesis that the effect exists and is different from zero. While software tools for Bayes factor null hypothesis tests are easily accessible, how to specify the data and the model correctly is often not clear. In Bayesian approaches, many authors use data aggregation at the by-subject level and estimate Bayes factors on aggregated data. Here, we use simulation-based calibration for model inference applied to several example experimental designs to demonstrate that, as with frequentist analysis, such null hypothesis tests on aggregated data can be problematic in Bayesian analysis. Specifically, when random slope variances differ (i.e., violated sphericity…
Taxonomy
Topics: Psychometric Methodologies and Testing · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
