Sparse modeling of categorial explanatory variables

Jan Gertheiss; Gerhard Tutz

arXiv:1101.1421·stat.AP·January 10, 2011

TL;DR

This paper introduces two $L_{1}$-penalty based shrinkage methods for categorical predictors in regression, one for nominal and one for ordinal predictors. Both perform factor selection and clustering of categories, and both are applied to the Munich rent standard data and compared in simulation studies.

Contribution

It proposes new shrinkage techniques specifically for categorical variables, addressing a gap in existing methods designed mainly for metric predictors.

Findings

01

Effective factor selection and category clustering demonstrated.

02

The methods are illustrated and compared in simulation studies.

03

Application to Munich rent data illustrates practical utility.

Abstract

Shrinking methods in regression analysis are usually designed for metric predictors. In this article, however, shrinkage methods for categorial predictors are proposed. As an application we consider data from the Munich rent standard, where, for example, urban districts are treated as a categorial predictor. If independent variables are categorial, some modifications to usual shrinking procedures are necessary. Two $L_{1}$-penalty based methods for factor selection and clustering of categories are presented and investigated. The first approach is designed for nominal scale levels, the second one for ordinal predictors. Besides applying them to the Munich rent standard, methods are illustrated and compared in simulation studies.
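
The abstract does not spell out the penalty terms, so the following is only a minimal sketch of how an $L_{1}$ penalty can produce both clustering of categories and factor selection. It assumes an unweighted fused-type penalty on the dummy coefficients of a single factor: all pairwise differences for a nominal predictor, and only differences between adjacent levels for an ordinal one. The function names and the example coefficients are hypothetical, and any weighting scheme is omitted.

```python
from itertools import combinations


def nominal_penalty(beta):
    """Assumed nominal penalty: sum of |beta_i - beta_j| over all category pairs.

    `beta` holds one coefficient per category, with the reference
    category included as a fixed 0.
    """
    return sum(abs(a - b) for a, b in combinations(beta, 2))


def ordinal_penalty(beta):
    """Assumed ordinal penalty: sum of |beta_i - beta_{i-1}| over adjacent levels."""
    return sum(abs(b - a) for a, b in zip(beta, beta[1:]))


# Hypothetical coefficients for a four-level factor (first entry = reference).
# Levels 1-2 and levels 3-4 share a coefficient, i.e. they form two clusters.
beta = [0.0, 0.0, 1.5, 1.5]

print(nominal_penalty(beta))
print(ordinal_penalty(beta))
```

Under these assumptions, a difference shrunk exactly to zero fuses two categories into one cluster, and if every coefficient is fused with the reference value 0, the whole factor drops out of the model.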
