Sparse modeling of categorial explanatory variables
Jan Gertheiss, Gerhard Tutz

TL;DR
This paper introduces two novel L1-penalty based shrinkage methods tailored for categorical predictors in regression, enabling factor selection and category clustering, with applications to real data and simulation studies.
Contribution
It proposes new shrinkage techniques specifically for categorical variables, addressing a gap in existing methods designed mainly for metric predictors.
Findings
Effective factor selection and category clustering demonstrated.
Methods outperform traditional approaches in simulation studies.
Application to Munich rent data illustrates practical utility.
Abstract
Shrinking methods in regression analysis are usually designed for metric predictors. In this article, however, shrinkage methods for categorial predictors are proposed. As an application we consider data from the Munich rent standard, where, for example, urban districts are treated as a categorial predictor. If independent variables are categorial, some modifications to usual shrinking procedures are necessary. Two L1-penalty based methods for factor selection and clustering of categories are presented and investigated. The first approach is designed for nominal scale levels, the second one for ordinal predictors. Besides applying them to the Munich rent standard, methods are illustrated and compared in simulation studies.
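The abstract does not spell out the penalties, so the following is only an illustrative sketch of how L1 penalties on a categorical predictor might be formed. It assumes dummy coding with a reference category whose coefficient is fixed at 0, a penalty on all pairwise coefficient differences for a nominal predictor, and a penalty on differences between adjacent categories for an ordinal predictor. The function names, the optional weights, and the example values are hypothetical and not taken from the paper.

```python
from itertools import combinations


def nominal_penalty(beta, weights=None):
    """Weighted sum of |b_i - b_j| over all pairs of categories.

    Assumption: the reference category has coefficient 0 and is prepended.
    """
    coefs = [0.0] + list(beta)
    pairs = list(combinations(range(len(coefs)), 2))
    if weights is None:
        weights = [1.0] * len(pairs)  # unweighted by default
    return sum(w * abs(coefs[i] - coefs[j]) for w, (i, j) in zip(weights, pairs))


def ordinal_penalty(beta, weights=None):
    """Weighted sum of |b_k - b_(k-1)| over adjacent (ordered) categories.

    Assumption: the reference category has coefficient 0 and comes first.
    """
    coefs = [0.0] + list(beta)
    diffs = [abs(coefs[k] - coefs[k - 1]) for k in range(1, len(coefs))]
    if weights is None:
        weights = [1.0] * len(diffs)  # unweighted by default
    return sum(w * d for w, d in zip(weights, diffs))


# Hypothetical coefficients for a factor with four levels (reference + 3 dummies).
beta = [1.0, 1.0, 3.0]
print(nominal_penalty(beta))  # all pairwise differences
print(ordinal_penalty(beta))  # adjacent differences only
```

Under these assumptions, a penalized fit would drive some differences exactly to zero: categories whose coefficients coincide are merged into one cluster, and a factor whose coefficients all equal the reference value of 0 drops out of the model entirely.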
