Harnessing XGBoost for Robust Biomarker Selection of Obsessive-Compulsive Disorder (OCD) from Adolescent Brain Cognitive Development (ABCD) data
Xinyu Shen, Qimin Zhang, Huili Zheng, Weiwei Qi

TL;DR
This paper compares machine learning models, especially XGBoost, for identifying biomarkers of OCD from correlated neuroimaging data, emphasizing model robustness and feature selection accuracy.
Contribution
It demonstrates the effectiveness of XGBoost in handling multicollinearity and selecting relevant features in neuroimaging data for OCD prediction.
Findings
XGBoost outperforms other models in feature selection accuracy.
Simulated data effectively mimics real neuroimaging correlation structures.
XGBoost handles multicollinearity better than logistic regression and elastic net.
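This summary does not define "feature selection accuracy". One plausible reading, assuming the truly predictive features are known because the data are simulated, is to compare the top-ranked features of a model against that ground truth. The sketch below is illustrative only: the function name, the top-k rule, and the importance scores are hypothetical and are not taken from the paper.

```python
def selection_precision_recall(importances, true_features, k):
    """Compare the k highest-scoring features against the known signal features.

    importances   : one importance score per feature (any model's output)
    true_features : indices of features that truly drive the outcome
    k             : number of top-ranked features treated as "selected"
    """
    # Rank feature indices from most to least important.
    ranked = sorted(range(len(importances)),
                    key=lambda j: importances[j], reverse=True)
    selected = set(ranked[:k])
    hits = len(selected & set(true_features))
    precision = hits / k                  # share of selected features that are real
    recall = hits / len(true_features)    # share of real features that were selected
    return precision, recall


# Hypothetical importance scores for six features; features 0 and 3 carry signal.
scores = [0.40, 0.05, 0.10, 0.30, 0.02, 0.13]
precision, recall = selection_precision_recall(scores, true_features={0, 3}, k=2)
```

Under multicollinearity, a model may split importance across correlated neighbours of a true feature, which lowers precision at a fixed k. That is the failure mode this kind of metric is meant to expose.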
Abstract
This study evaluates the performance of various supervised machine learning models in analyzing highly correlated neural signaling data from the Adolescent Brain Cognitive Development (ABCD) Study, with a focus on predicting Obsessive-Compulsive Disorder (OCD) scales. We simulated a dataset to mimic the correlation structures commonly found in imaging data and evaluated logistic regression, elastic net, random forests, and XGBoost on their ability to handle multicollinearity and accurately identify predictive features. Our study aims to guide the selection of appropriate machine learning methods for processing neuroimaging data, highlighting models that best capture underlying signals when features are highly correlated and that prioritize clinically relevant features associated with OCD.
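The abstract does not spell out the simulation design. As a minimal sketch of how data with an imaging-like correlation structure might be generated, the code below assumes a block structure in which features within a block share a latent factor (pairwise correlation `rho`) and a binary outcome is drawn from a logistic model on a few signal features. The block layout, `rho`, `signal_idx`, and `beta` are all illustrative assumptions, not the authors' settings.

```python
import math
import random


def simulate_block_correlated(n_samples, n_blocks, block_size, rho,
                              signal_idx, beta, seed=0):
    """Simulate block-correlated features and a binary outcome (assumed design)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = []
        for _ in range(n_blocks):
            shared = rng.gauss(0.0, 1.0)  # latent factor shared within the block
            for _ in range(block_size):
                noise = rng.gauss(0.0, 1.0)
                # Unit-variance feature; within-block correlation equals rho.
                row.append(math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise)
        # Outcome depends only on the designated signal features.
        logit = sum(beta * row[j] for j in signal_idx)
        prob = 1.0 / (1.0 + math.exp(-logit))
        y.append(1 if rng.random() < prob else 0)
        X.append(row)
    return X, y


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


X, y = simulate_block_correlated(2000, n_blocks=3, block_size=4, rho=0.8,
                                 signal_idx=[0, 4], beta=1.0)


def column(j):
    return [row[j] for row in X]


within_block = pearson(column(0), column(1))   # same block: close to rho
between_block = pearson(column(0), column(4))  # different blocks: close to 0
```

Because the signal indices are known by construction, importance rankings from any fitted model can be checked against them, which is what makes a simulated dataset useful for comparing feature selection across models.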
