# High-dimensional classification by sparse logistic regression

**Authors:** Felix Abramovich, Vadim Grinshtein

arXiv: 1706.08344 · 2018-11-20

## TL;DR

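This paper studies high-dimensional binary classification by sparse logistic regression. It derives non-asymptotic bounds on the misclassification excess risk of a complexity-penalized maximum likelihood selector and extends the Slope estimator to logistic regression, giving a computationally feasible procedure that is minimax rate-optimal under an additional weighted restricted eigenvalue condition.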

## Contribution

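It proposes a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size, where the penalty is related to the VC-dimension of a set of sparse linear classifiers. Because any such criterion requires a combinatorial search over all possible models, it also extends the Slope estimator to logistic regression as a computationally feasible alternative.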

## Key findings

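- Derives non-asymptotic bounds for the misclassification excess risk of the penalized maximum likelihood selector; the bounds can be reduced under an additional low-noise condition.
- Shows that the Slope estimator extended to logistic regression is rate-optimal in the minimax sense under an additional weighted restricted eigenvalue condition.
- Relates the proposed complexity penalty to the VC-dimension of a set of sparse linear classifiers.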

## Abstract

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty is remarkably related to the VC-dimension of a set of sparse linear classifiers. Implementation of any complexity penalty-based criterion, however, requires a combinatorial search over all possible models. To find a model selection procedure computationally feasible for high-dimensional data, we extend the Slope estimator for logistic regression and show that under an additional weighted restricted eigenvalue condition it is rate-optimal in the minimax sense.
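To make the Slope idea in the abstract concrete, the following is a minimal sketch of a sorted-L1 (Slope) penalized logistic objective. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the weight sequence `sqrt(log(2d/j)/n)` is a form commonly seen in the Slope literature rather than one confirmed by this summary, and the choice to average the log-likelihood over observations is a scaling assumption.

```python
import numpy as np


def slope_weights(d, n, scale=1.0):
    """Nonincreasing weights lam_1 >= ... >= lam_d.

    Assumed form for illustration only; the paper's exact
    weights are not given in this summary.
    """
    j = np.arange(1, d + 1)
    return scale * np.sqrt(np.log(2.0 * d / j) / n)


def sorted_l1_penalty(beta, lam):
    """Sorted-L1 penalty: sum_j lam_j * |beta|_(j),
    where |beta|_(1) >= |beta|_(2) >= ... are the sorted magnitudes."""
    abs_sorted = np.sort(np.abs(beta))[::-1]  # descending magnitudes
    return float(np.dot(lam, abs_sorted))


def logistic_nll(X, y, beta):
    """Average negative log-likelihood for labels y in {0, 1}."""
    z = X @ beta
    # log(1 + exp(z)) evaluated stably with logaddexp
    return float(np.mean(np.logaddexp(0.0, z) - y * z))


def slope_objective(X, y, beta, lam):
    """Penalized criterion: data fit plus sorted-L1 penalty."""
    return logistic_nll(X, y, beta) + sorted_l1_penalty(beta, lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 50, 10
    X = rng.standard_normal((n, d))
    y = rng.integers(0, 2, size=n).astype(float)
    lam = slope_weights(d, n)
    beta = np.zeros(d)
    beta[:2] = [1.0, -0.5]  # a sparse candidate coefficient vector
    print(slope_objective(X, y, beta, lam))
```

Because the largest coefficient is paired with the largest weight, the penalty depends only on the sorted magnitudes of `beta`, so it is invariant to permuting coordinates. The objective is convex in `beta`, which is what lets a Slope-type estimator avoid the combinatorial search that a model-size penalty requires.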

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.08344/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1706.08344/full.md

---
Source: https://tomesphere.com/paper/1706.08344