A Note on Coding and Standardization of Categorical Variables in (Sparse) Group Lasso Regression

Felicitas J. Detmer; Martin Slawski

arXiv:1805.06915·stat.CO·May 21, 2018

TL;DR

This note studies standardization in group lasso regression with categorical predictors. Its main result is that block-wise orthonormalization is not required: column-wise scaling of the design matrix suffices, which simplifies implementation while retaining the performance benefit of standardization.

Contribution

It demonstrates that column-wise scaling of the design matrix is equivalent to orthonormalization for categorical predictors in group lasso, simplifying standardization procedures.
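One elementary fact helps make this plausible and can be checked numerically. It is offered here as an illustrative sketch, not as the paper's argument: with full indicator coding, each observation belongs to exactly one level, so the indicator columns of a single categorical variable are mutually orthogonal, and dividing each column by its norm already yields an orthonormal block. The sketch ignores the intercept and the identifiability constraint that the note treats, and the helper names `indicator_matrix` and `scale_columns` are hypothetical.

```python
import numpy as np


def indicator_matrix(labels):
    """Full indicator (one-hot) coding: one column per observed level."""
    levels, codes = np.unique(labels, return_inverse=True)
    Z = np.zeros((len(labels), len(levels)))
    Z[np.arange(len(labels)), codes] = 1.0
    return levels, Z


def scale_columns(Z):
    """Column-wise scaling: divide each column by its Euclidean norm."""
    norms = np.linalg.norm(Z, axis=0)
    return Z / norms, norms


# Toy categorical variable with three levels and unequal level counts.
labels = np.array(["a", "b", "a", "c", "b", "a"])
levels, Z = indicator_matrix(labels)
Z_scaled, norms = scale_columns(Z)

# Gram matrix of the scaled block; for indicator columns it is the identity,
# i.e. the scaled block is already orthonormal.
gram = Z_scaled.T @ Z_scaled
print(np.allclose(gram, np.eye(len(levels))))
```

The squared column norms equal the level counts, so in this simplified setting the scaling amounts to dividing each indicator column by the square root of its level frequency.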

Findings

01. Column scaling achieves the same effect as orthonormalization.

02. Proper standardization significantly improves model performance.

03. Extensions to sparse group lasso are also discussed.

Abstract

Categorical regressor variables are usually handled by introducing a set of indicator variables, and imposing a linear constraint to ensure identifiability in the presence of an intercept, or equivalently, using one of various coding schemes. As proposed in Yuan and Lin [J. R. Statist. Soc. B, 68 (2006), 49-67], the group lasso is a natural and computationally convenient approach to perform variable selection in settings with categorical covariates. As pointed out by Simon and Tibshirani [Stat. Sin., 22 (2011), 983-1001], "standardization" by means of block-wise orthonormalization of column submatrices each corresponding to one group of variables can substantially boost performance. In this note, we study the aspect of standardization for the special case of categorical predictors in detail. The main result is that orthonormalization is not required; column-wise scaling of the design…
