Hybrid (Penalized Regression and MLP) Models for Outcome Prediction in HDLSS Health Data

Mithra D K

arXiv:2512.02489·cs.LG·December 3, 2025

TL;DR

This paper applies a hybrid model, an XGBoost feature encoder feeding a lightweight MLP head, to diabetes status prediction on NHANES health survey data, and reports higher AUC and balanced accuracy than logistic regression, random forest, and XGBoost baselines.

Contribution

It applies established techniques in a hybrid configuration: XGBoost serves as a feature encoder and a lightweight MLP serves as the prediction head. On a processed NHANES subset, this hybrid is reported to improve on the baseline models.
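One way such a hybrid could be wired is sketched below. Everything in it is an assumption made for illustration: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, the encoding is the one-hot leaf index of each tree, the data is synthetic rather than NHANES, and all hyperparameters and variable names are hypothetical. The paper's released code may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import OneHotEncoder

# Synthetic, imbalanced tabular data (a placeholder, NOT the NHANES subset).
X, y = make_classification(n_samples=600, n_features=40, n_informative=8,
                           weights=[0.8, 0.2], random_state=0)

# Disjoint splits: one to fit the tree encoder, one to fit the MLP head,
# one held out for evaluation.
X_enc, X_rest, y_enc, y_rest = train_test_split(
    X, y, test_size=0.6, stratify=y, random_state=0)
X_head, X_test, y_head, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

# Stage 1: boosted trees used as a feature encoder (stand-in for XGBoost).
encoder = GradientBoostingClassifier(n_estimators=30, max_depth=3,
                                     random_state=0)
encoder.fit(X_enc, y_enc)


def leaf_indices(model, features):
    """Return one leaf index per tree for each row: (n_samples, n_trees)."""
    # apply() yields (n_samples, n_estimators, 1) for binary problems.
    return model.apply(features).reshape(features.shape[0], -1)


# One-hot encode the leaf indices; leaves unseen at fit time are ignored.
onehot = OneHotEncoder(handle_unknown="ignore")
Z_head = onehot.fit_transform(leaf_indices(encoder, X_head))
Z_test = onehot.transform(leaf_indices(encoder, X_test))

# Stage 2: a small MLP head trained on the encoded features.
head = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
head.fit(Z_head, y_head)

proba = head.predict_proba(Z_test)[:, 1]
auc = roc_auc_score(y_test, proba)
bal_acc = balanced_accuracy_score(y_test, head.predict(Z_test))
print(f"AUC={auc:.3f}  balanced accuracy={bal_acc:.3f}")
```

The three-way split keeps the encoder, the head, and the evaluation on disjoint rows, so the head does not train on leaf assignments that the trees already fitted to the same labels.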

Findings

01

Hybrid model attains higher AUC and balanced accuracy than the logistic regression, random forest, and XGBoost baselines

02

Code and scripts are publicly released for reproducibility

03

Results are reported on a processed NHANES subset, a high-dimensional, low-sample-size (HDLSS) health dataset

Abstract

I present an application of established machine learning techniques to NHANES health survey data for predicting diabetes status. I compare baseline models (logistic regression, random forest, XGBoost) with a hybrid approach that uses an XGBoost feature encoder and a lightweight multilayer perceptron (MLP) head. Experiments show the hybrid model attains improved AUC and balanced accuracy compared to baselines on the processed NHANES subset. I release code and reproducible scripts to encourage replication.
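For readers unfamiliar with the two reported metrics, the following is a minimal pure-Python sketch of how AUC and balanced accuracy are commonly computed. The labels and scores are made-up toy values, not results from the paper, and the function names are hypothetical. The toy example also shows why balanced accuracy is often preferred over plain accuracy when one class is rare.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 8 negatives, 2 positives (illustrative values only).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_negative = [0] * 10
scores = [0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.6, 0.5, 0.7]

acc = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
bal_acc = balanced_accuracy(y_true, always_negative)
auc = roc_auc(y_true, scores)

print(acc)      # 0.8: looks good, yet no positive case is ever detected
print(bal_acc)  # 0.5: no better than chance once classes are weighted equally
print(auc)      # 0.9375: 15 of the 16 positive/negative pairs are ordered correctly
```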

Taxonomy

Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · Imbalanced Data Classification Techniques