Collapsing Categories for Regression with Mixed Predictors
Chaegeun Song, Zhong Zheng, Bing Li, Lingzhou Xue

TL;DR
This paper presents a novel adaptive method for collapsing categories in regression models with mixed predictors, improving accuracy and interpretability by reducing categorical complexity using pairwise vector fused LASSO.
Contribution
The paper introduces a systematic, theoretically grounded approach to collapsing categories in regression that applies to a wide class of models defined by a general loss function, with an efficient algorithm and proven consistency.
Findings
Effective reduction of categorical complexity in simulations
Improved prediction accuracy demonstrated on Spotify data
Theoretical guarantees for category collapsing consistency
Abstract
Categorical predictors are omnipresent in everyday regression practice: in fact, most regression data involve some categorical predictors, and this tendency is increasing in modern applications with more complex structures and larger data sizes. However, including too many categories in a regression model would seriously hamper accuracy, as the information in the data is fragmented by the multitude of categories. In this paper, we introduce a systematic method to reduce the complexity of categorical predictors by adaptively collapsing categories in regressions, so as to enhance the performance of regression estimation. Our method is based on the pairwise vector fused LASSO, which automatically fuses the categories that bear a similar regression relation with the response. We develop our method under a wide class of regression models defined by a general loss function, which…
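To make the fusing idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each category carries its own coefficient vector and that the penalty is an unweighted sum, over all pairs of categories, of the Euclidean distance between their vectors; the exact penalty, any adaptive weights, and the optimization algorithm are not given in the abstract. The function names `pairwise_fused_penalty` and `collapse_categories` and the parameters `lam` and `tol` are hypothetical.

```python
import itertools
import math


def pairwise_fused_penalty(coefs, lam=1.0):
    """Assumed penalty: lam * sum over category pairs (j, k) of ||beta_j - beta_k||_2.

    coefs is a list of per-category coefficient vectors of equal length.
    """
    pairs = itertools.combinations(range(len(coefs)), 2)
    return lam * sum(math.dist(coefs[j], coefs[k]) for j, k in pairs)


def collapse_categories(coefs, tol=1e-8):
    """Group categories whose fitted coefficient vectors coincide (within tol).

    Each category is compared with the first member of every existing group,
    so this is a simple illustrative grouping rule, not the paper's procedure.
    """
    groups = []
    for j, beta in enumerate(coefs):
        for group in groups:
            if math.dist(beta, coefs[group[0]]) <= tol:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups


# Hypothetical fitted coefficients for three categories, two coefficients each.
example_coefs = [[1.0, 2.0], [1.0, 2.0], [4.0, 6.0]]
penalty_value = pairwise_fused_penalty(example_coefs)
collapsed = collapse_categories(example_coefs)
```

In this toy input the first two categories have identical coefficient vectors, so they contribute nothing to the penalty and end up in the same group; a penalty of this kind shrinks pairwise differences toward zero, which is what allows categories to be merged.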
Taxonomy
Topics: Face and Expression Recognition · Stochastic Gradient Optimization Techniques · Imbalanced Data Classification Techniques
