Statistical Inference for Smoothed Support Vector Machines in High Dimensions: From Offline to Online Data
Shuya Zhou, Junwen Xia, and Jingxiao Zhang

TL;DR
This paper introduces a unified framework for statistical inference in high-dimensional Lasso-penalized linear support vector machines. It covers both offline and online (streaming) data and combines convolution smoothing of the hinge loss with debiasing.
Contribution
It develops a novel smoothing-based debiased estimator for high-dimensional SVMs, enabling valid inference in offline and online settings.
Findings
The offline estimator achieves asymptotic normality and valid confidence intervals.
The online estimator provides real-time inference using only summary statistics.
Simulations and real data show improved inference accuracy and computational efficiency.
Abstract
High-dimensional classification problems often rely on Lasso-penalized linear support vector machines (SVMs). However, the double non-smoothness induced by the hinge loss and the Lasso penalty in this model makes statistical inference challenging and impedes computational efficiency. In this paper, we propose a unified inference framework for both offline and online settings. In the offline case, by applying a convolution smoothing technique to the hinge loss, we construct a debiased estimator that eliminates the shrinkage bias and thereby yields a valid confidence interval. For online streaming data, we develop a real-time estimator and inference procedure that relies only on summary statistics of historical data. Theoretically, we provide rigorous proofs of the asymptotic normality of our offline and online debiased estimators. Simulation studies and real data applications demonstrate…
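The convolution smoothing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel and a bandwidth `h`, neither of which is specified in the text above, and the function names are hypothetical, so this may differ from the authors' actual construction. Under that assumption, convolving the hinge loss max(0, 1 - u) with a Gaussian density of scale `h` has the closed form t * Phi(t/h) + h * phi(t/h) with t = 1 - u, which is differentiable everywhere in the margin u.

```python
import math


def hinge(u):
    """Standard hinge loss at margin u = y * x'beta."""
    return max(0.0, 1.0 - u)


def _std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Assumption: Gaussian kernel, chosen here only because it gives a
    closed form; the paper's kernel choice is not stated in this summary.
    """
    t = 1.0 - u
    z = t / h
    return t * _std_normal_cdf(z) + h * _std_normal_pdf(z)


def smoothed_hinge_grad(u, h):
    """Derivative of the smoothed hinge loss with respect to the margin u."""
    return -_std_normal_cdf((1.0 - u) / h)
```

As `h` shrinks, the smoothed loss approaches the original hinge loss, while for any fixed `h > 0` its derivative is a smooth function of the margin instead of a step at u = 1. That smoothness is what makes gradient- and Hessian-based debiasing arguments tractable.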
