Algorithms for Generalized Cluster-wise Linear Regression
Young Woong Park, Yan Jiang, Diego Klabjan, Loren Williams

TL;DR
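This paper generalizes cluster-wise linear regression (CLR) to allow multiple observations per entity and proposes five solution approaches: an exact column generation method, a column generation based heuristic, a genetic algorithm with an adapted Lloyd's algorithm, a two-stage approach, and a modified version of Späth's algorithm. The algorithms are tested on a retail SKU clustering problem.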
Contribution
It extends CLR to a generalized form with multiple observations per entity and develops multiple algorithms for its solution, including an exact method and heuristics.
Findings
Algorithms perform well on real-world retail data.
Metaheuristic and heuristic algorithms are effective for large instances.
The generalized CLR effectively clusters SKUs based on seasonal effects.
Abstract
Cluster-wise linear regression (CLR), a clustering problem intertwined with regression, is to find clusters of entities such that the overall sum of squared errors from regressions performed over these clusters is minimized, where each cluster may have different variances. We generalize the CLR problem by allowing each entity to have more than one observation, and refer to it as generalized CLR. We propose an exact mathematical programming based approach relying on column generation, a column generation based heuristic algorithm that clusters predefined groups of entities, a metaheuristic genetic algorithm with adapted Lloyd's algorithm for K-means clustering, a two-stage approach, and a modified algorithm of Späth (1979) for solving generalized CLR. We examine the performance of our algorithms on a stock keeping unit (SKU) clustering problem employed in forecasting halo…
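To make the generalized CLR objective concrete, here is a minimal sketch of a Lloyd-style alternating heuristic: fit one regression per cluster, then move each entity (with all of its observations) to the cluster whose regression gives it the lowest sum of squared errors, and repeat until nothing moves. This is an illustration only, not the paper's column generation, genetic, two-stage, or modified Späth algorithm. The function names (`fit_line`, `sse`, `generalized_clr`), the single-predictor model y = a + b*x, the toy data, and the starting assignment are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Obs = Tuple[float, float]      # one (x, y) observation; single predictor is an assumption
Coef = Tuple[float, float]     # (intercept a, slope b)


def fit_line(points: List[Obs]) -> Coef:
    """Ordinary least squares for y = a + b*x."""
    n = len(points)
    if n == 0:
        return 0.0, 0.0                      # empty cluster: arbitrary fallback
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0.0:
        return my, 0.0                       # no spread in x: fit a constant
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b


def sse(points: List[Obs], coef: Coef) -> float:
    """Sum of squared errors of the points under the line coef."""
    a, b = coef
    return sum((y - a - b * x) ** 2 for x, y in points)


def fit_clusters(entities: Dict[str, List[Obs]],
                 assignment: Dict[str, int], k: int) -> List[Coef]:
    """Fit one regression per cluster on the pooled observations of its entities."""
    coefs = []
    for c in range(k):
        pooled = [p for e, obs in entities.items() if assignment[e] == c for p in obs]
        coefs.append(fit_line(pooled))
    return coefs


def generalized_clr(entities: Dict[str, List[Obs]],
                    start: Dict[str, int], k: int, max_iter: int = 50):
    """Alternate between refitting regressions and reassigning whole entities."""
    assignment = dict(start)
    for _ in range(max_iter):
        coefs = fit_clusters(entities, assignment, k)
        changed = False
        for e, obs in entities.items():
            # An entity moves as a unit: all of its observations share one cluster.
            best = min(range(k), key=lambda c: sse(obs, coefs[c]))
            if best != assignment[e]:
                assignment[e] = best
                changed = True
        if not changed:
            break
    coefs = fit_clusters(entities, assignment, k)   # refit for the final assignment
    total = sum(sse(obs, coefs[assignment[e]]) for e, obs in entities.items())
    return assignment, coefs, total


# Toy data (hypothetical): A and B follow y = 2x, C and D follow y = 10 - x.
entities = {
    "A": [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)],
    "B": [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
    "C": [(0.0, 10.0), (1.0, 9.0), (2.0, 8.0)],
    "D": [(3.0, 7.0), (4.0, 6.0), (5.0, 5.0)],
}
start = {"A": 0, "B": 0, "C": 0, "D": 1}            # deliberately wrong for C
assignment, coefs, total = generalized_clr(entities, start, k=2)
```

Like Lloyd's algorithm for K-means, a scheme of this kind only reaches a local optimum and depends on the starting assignment, which is one reason an exact method and metaheuristics are of interest for this problem.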
Taxonomy
Methods: Linear Regression
