Robust and Sparse Generalized Linear Models for High-Dimensional Data via Maximum Mean Discrepancy

Xiaoning Kang; Lulu Kang

arXiv:2602.21132·stat.ME·February 25, 2026

Open Access

TL;DR

This paper introduces a robust, sparse estimation method for high-dimensional GLMs using a penalized MMD framework, improving variable selection and robustness against outliers and heavy-tailed noise.

Contribution

The paper develops a novel penalized MMD approach, with efficient algorithms, for robust estimation in high-dimensional GLMs, addressing both variable selection and outlier resistance.

Findings

1. Outperforms classical penalized GLMs in simulations

2. Effective against high-leverage points and heavy-tailed noise

3. Provides computationally efficient approximation methods

Abstract

High-dimensional datasets are frequently subject to contamination by outliers and heavy-tailed noise, which can severely bias standard regularized estimators like the Lasso. While Maximum Mean Discrepancy (MMD) has recently been introduced as a "universal" framework for robust regression, its application to high-dimensional Generalized Linear Models (GLMs) remains largely unexplored, particularly regarding variable selection. In this paper, we propose a penalized MMD framework for robust estimation and feature selection in GLMs. We introduce an $\ell_{1}$-penalized MMD objective and develop two versions of the estimator: a full $O(n^{2})$ version and a computationally efficient $O(n)$ approximation. To solve the resulting non-convex optimization problem, we employ an algorithm based on the Alternating Direction Method of Multipliers (ADMM) combined with AdaGrad. Through extensive simulation…
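The abstract contrasts a full $O(n^{2})$ estimator with an $O(n)$ approximation but does not spell out either one here. As a rough illustration of where that cost gap comes from, the sketch below computes two standard estimates of the squared MMD between two samples: the quadratic-time U-statistic and the linear-time paired estimate from the kernel two-sample testing literature. The Gaussian kernel, the fixed bandwidth, and the function names `mmd2_quadratic` and `mmd2_linear` are assumptions made for illustration. This is not the authors' penalized objective, their approximation, or their ADMM/AdaGrad solver.

```python
import numpy as np


def _as_2d(a):
    """Return samples as an (n, d) array; 1-D input is treated as n scalars."""
    a = np.asarray(a, dtype=float)
    return a[:, None] if a.ndim == 1 else a


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    a, b = _as_2d(a), _as_2d(b)
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd2_quadratic(x, y, bandwidth=1.0):
    """U-statistic estimate of squared MMD; needs all O(n^2) kernel pairs."""
    x, y = _as_2d(x), _as_2d(y)
    n, m = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Drop the diagonal so within-sample terms average over distinct pairs.
    term_xx = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()


def mmd2_linear(x, y, bandwidth=1.0):
    """Linear-time estimate: average a kernel statistic over disjoint pairs."""
    x, y = _as_2d(x), _as_2d(y)
    n = (min(len(x), len(y)) // 2) * 2  # use an even number of points
    x1, x2 = x[0:n:2], x[1:n:2]
    y1, y2 = y[0:n:2], y[1:n:2]

    def k(a, b):
        # Kernel evaluated row-wise on matched pairs only: O(n) evaluations.
        return np.exp(-np.sum((a - b) ** 2, axis=1) / (2.0 * bandwidth**2))

    h = k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1)
    return h.mean()
```

The linear-time version trades statistical efficiency for speed: it touches each observation once, so its variance is higher than that of the quadratic estimate at the same sample size.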

Taxonomy

Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques