Impact of Validation Strategy on Machine Learning Performance in EEG-Based Alcoholism Classification
Tahir Cetin Akinci, Yuksel Celik, Omer Faruk Ertugrul

TL;DR
This study demonstrates that validation strategies significantly influence EEG-based alcoholism classification performance, with nested cross-validation providing more reliable estimates than standard methods.
Contribution
It introduces a validation-aware framework that assesses how evaluation methodology impacts reported performance in EEG-based alcoholism classification.
Findings
Nested cross-validation reduces optimistic bias in performance estimates.
AdaBoost achieved the highest accuracy of 78.3% with stable generalization.
Most performance differences between models are not statistically significant.
Abstract
Electroencephalography provides a non-invasive and cost-effective approach for analyzing neural patterns associated with alcohol dependence. However, reported classification performance in EEG-based alcoholism studies varies considerably, often due to differences in validation strategies rather than intrinsic model capability. This study presents a validation-aware machine learning framework to assess the impact of evaluation methodology on classification performance. A balanced multi-channel EEG dataset of 300 trials (150 alcoholic, 150 control) was analyzed using a structured feature representation combining statistical descriptors and spectral band interactions. Five classifiers (support vector machines with linear and radial basis function kernels, random forest, k-nearest neighbors, and AdaBoost) were evaluated under standard and nested cross-validation protocols. Results…
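The difference between the two protocols named in the abstract can be sketched in a few lines. The example below is a minimal illustration and not the authors' pipeline: it uses synthetic Gaussian features in place of EEG data, a small hand-written k-nearest-neighbors classifier, and hypothetical helper names (make_data, k_folds, cv_score, nested_cv, standard_cv). The only point it makes is structural: in the nested version the hyperparameter is chosen using the outer-training trials alone, while the standard version tunes and reports on the same folds.

```python
import random
from statistics import mean


def make_data(n_per_class=150, n_features=8, shift=0.6, seed=0):
    """Synthetic stand-in for a balanced 300-trial feature matrix (not real EEG)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (k is assumed odd)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0


def fold_accuracy(X, y, train_idx, test_idx, k):
    """Fit on train_idx (k-NN just stores the points) and score on test_idx."""
    X_tr = [X[i] for i in train_idx]
    y_tr = [y[i] for i in train_idx]
    hits = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def k_folds(indices, n_folds, rng):
    """Shuffle the indices and split them into n_folds disjoint test folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cv_score(X, y, indices, folds, k):
    """Mean accuracy over the given folds, using only the trials in `indices`."""
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        scores.append(fold_accuracy(X, y, train, test, k))
    return mean(scores)


def nested_cv(X, y, k_grid=(1, 3, 5, 9, 15), outer=5, inner=3, seed=0):
    """Outer loop estimates performance; inner loop selects k on training data only."""
    rng = random.Random(seed)
    all_idx = list(range(len(y)))
    outer_scores = []
    for test in k_folds(all_idx, outer, rng):
        held_out = set(test)
        train = [i for i in all_idx if i not in held_out]
        inner_folds = k_folds(train, inner, rng)
        best_k = max(k_grid, key=lambda k: cv_score(X, y, train, inner_folds, k))
        # The outer test fold played no part in choosing best_k.
        outer_scores.append(fold_accuracy(X, y, train, test, best_k))
    return outer_scores


def standard_cv(X, y, k_grid=(1, 3, 5, 9, 15), n_folds=5, seed=0):
    """Non-nested: tune k and report the score on the same folds."""
    all_idx = list(range(len(y)))
    folds = k_folds(all_idx, n_folds, random.Random(seed))
    return max(cv_score(X, y, all_idx, folds, k) for k in k_grid)


if __name__ == "__main__":
    X, y = make_data()
    print("standard CV (tuned and scored on the same folds):", standard_cv(X, y))
    print("nested CV (mean of outer folds):", mean(nested_cv(X, y)))
```

Because the standard variant reports the maximum over the hyperparameter grid, its score tends to be optimistic on average, though on any single dataset and seed the two numbers can fall in either order. The fold counts, the grid of k values, and the choice of k-NN are all illustrative assumptions.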
