Novel Prediction Techniques Based on Clusterwise Linear Regression
Igor Gitman, Jieshi Chen, Eric Lei, Artur Dubrawski

TL;DR
This paper introduces two novel methods for applying Clusterwise Linear Regression (CLR) to prediction tasks, addressing the challenge of assigning cluster labels to unseen data, and demonstrates their effectiveness on multiple datasets.
Contribution
The paper proposes predictive CLR and constrained CLR, novel approaches that enable CLR-based regression models to predict on new data effectively.
Findings
Both methods significantly outperform existing CLR-based regression methods.
Predictive CLR outperforms linear regression and random forest, and is comparable to support vector regression.
Constrained CLR achieves top performance with manageable computational cost.
Abstract
In this paper we explore different regression models based on Clusterwise Linear Regression (CLR). CLR aims to find a partition of the data into clusters such that linear regressions fitted to each of the clusters minimize the overall mean squared error on the whole data. The main obstacle to using the resulting regression models for prediction on unseen test points is the absence of a reasonable way to obtain CLR cluster labels when the values of the target variable are unknown. In this paper we propose two novel approaches to solving this problem. The first approach, predictive CLR, builds a separate classification model to predict test CLR labels. The second approach, constrained CLR, utilizes a set of user-specified constraints that force certain points to go to the same clusters. Assuming the constraint values are known for the test points, they can be directly used to…
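The idea behind CLR and the predictive variant can be illustrated with a minimal sketch. It is not the authors' implementation: the alternating fit/reassign loop is one common heuristic for the CLR objective, and the 1-nearest-neighbor label predictor is a stand-in chosen only to keep the example dependency-free (the abstract does not say which classifier is used). The function names `fit_clr`, `predict_labels_1nn`, and `predict_clr` are hypothetical.

```python
import numpy as np

def fit_clr(X, y, k, n_iter=50, seed=0):
    """Heuristic CLR: alternate per-cluster least squares and reassignment."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append an intercept column
    labels = rng.integers(0, k, size=len(X))    # random initial partition
    W = np.zeros((k, Xb.shape[1]))              # one weight vector per cluster
    for _ in range(n_iter):
        for c in range(k):
            mask = labels == c
            if mask.sum() >= Xb.shape[1]:       # skip clusters too small to fit
                W[c] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]
        # Squared residual of every point under every cluster's model.
        resid = (Xb @ W.T - y[:, None]) ** 2
        new_labels = resid.argmin(axis=1)       # this step needs the targets y
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return W, labels

def predict_labels_1nn(X_train, labels, X_test):
    """Stand-in classifier (assumption): copy the label of the nearest training point."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return labels[d.argmin(axis=1)]

def predict_clr(W, X_test, test_labels):
    """Apply each test point's cluster-specific linear model."""
    Xb = np.hstack([X_test, np.ones((len(X_test), 1))])
    return (Xb * W[test_labels]).sum(axis=1)

# Synthetic data with two linear regimes (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.where(X[:, 0] < 0, -3 * X[:, 0] + 1, 2 * X[:, 0] - 1)
y = y + 0.05 * rng.standard_normal(200)

W, labels = fit_clr(X, y, k=2)
X_new = rng.uniform(-1, 1, size=(20, 1))
y_new_pred = predict_clr(W, X_new, predict_labels_1nn(X, labels, X_new))
```

The reassignment step inside `fit_clr` uses the targets `y`, which is exactly why it cannot be reused at test time; the separate label predictor fills that gap by working from the features alone.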
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Advanced Clustering Algorithms Research
Methods: Linear Regression
