Hybrid (Penalized Regression and MLP) Models for Outcome Prediction in HDLSS Health Data
Mithra D K

TL;DR
This paper applies a hybrid model, an XGBoost feature encoder feeding a lightweight MLP head, to diabetes status prediction on high-dimensional, low-sample-size (HDLSS) NHANES health survey data, and reports higher AUC and balanced accuracy than logistic regression, random forest, and XGBoost baselines.
Contribution
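It presents a hybrid model that integrates an XGBoost feature encoder with a lightweight MLP head, built from established machine learning techniques, and compares it against logistic regression, random forest, and XGBoost baselines on an HDLSS health dataset.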
Findings
Hybrid model attains higher AUC and balanced accuracy than the logistic regression, random forest, and XGBoost baselines
Code and scripts are publicly released for reproducibility
Results are demonstrated on a processed subset of NHANES health survey data
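
The abstract reports AUC and balanced accuracy. As a reminder of why these are often preferred over plain accuracy when classes are imbalanced, here is a minimal pure-Python sketch of both metrics. The labels and scores are made-up toy values chosen for illustration, not results from the paper, and the function names are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A classifier that always predicts the majority class.
always_negative = [0] * len(y_true)
plain_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
bal_acc = balanced_accuracy(y_true, always_negative)

# Made-up scores for the AUC example.
scores = [0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.6, 0.5, 0.7]
auc = roc_auc(y_true, scores)
```

On this toy data the always-negative classifier reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which is why balanced accuracy is the more informative figure when positive cases are rare.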
Abstract
I present an application of established machine learning techniques to NHANES health survey data for predicting diabetes status. I compare baseline models (logistic regression, random forest, XGBoost) with a hybrid approach that uses an XGBoost feature encoder and a lightweight multilayer perceptron (MLP) head. Experiments show the hybrid model attains improved AUC and balanced accuracy compared to baselines on the processed NHANES subset. I release code and reproducible scripts to encourage replication.
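
The hybrid idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the released code: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, the "feature encoding" is assumed to be one-hot encoded leaf indices, the data is synthetic rather than NHANES, and all hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import OneHotEncoder

# Synthetic, imbalanced, wide data as a placeholder (NOT the NHANES subset).
X, y = make_classification(
    n_samples=400, n_features=200, n_informative=10,
    weights=[0.85, 0.15], random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stage 1: a boosted tree ensemble used as a feature encoder.
# Assumption: a scikit-learn stand-in for the XGBoost encoder in the abstract.
encoder = GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0)
encoder.fit(X_tr, y_tr)


def leaf_indices(model, data):
    # apply() returns (n_samples, n_estimators, 1) for binary problems;
    # flatten to one leaf-index column per tree.
    return model.apply(data).reshape(data.shape[0], -1)


# One-hot encode the leaf each sample lands in, per tree.
one_hot = OneHotEncoder(handle_unknown="ignore")
Z_tr = one_hot.fit_transform(leaf_indices(encoder, X_tr))
Z_te = one_hot.transform(leaf_indices(encoder, X_te))

# Stage 2: a lightweight MLP head trained on the encoded features.
head = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
head.fit(Z_tr, y_tr)

proba = head.predict_proba(Z_te)[:, 1]
auc = roc_auc_score(y_te, proba)
bal_acc = balanced_accuracy_score(y_te, head.predict(Z_te))
print(f"AUC={auc:.3f}  balanced accuracy={bal_acc:.3f}")
```

One design caveat with this kind of two-stage sketch: fitting the encoder and the head on the same training rows can let the head overfit to leaves the trees have already memorized, so a held-out split for the second stage is a common precaution.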
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · Imbalanced Data Classification Techniques
