A Generalized Framework for Predictive Clustering and Optimization
Aravinth Chembu, Scott Sanner

TL;DR
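This paper introduces a generalized framework for predictive clustering that supports several cluster definitions and both regression and classification objectives. It is solved with mixed-integer linear programming (MILP), complemented by scalable algorithms for large datasets, and is demonstrated on real datasets.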
Contribution
The paper proposes a unified optimization framework for supervised clustering that covers multiple cluster types and objectives, together with scalable algorithms for large datasets.
Findings
The framework can uncover interpretable cluster structures
The mixed-integer linear programming formulation reaches globally optimal solutions
Scalable greedy algorithms perform well on large datasets
Abstract
Clustering is a powerful and extensively used data science tool. While clustering is generally thought of as an unsupervised learning technique, there are also supervised variations such as Späth's clusterwise regression that attempt to find clusters of data that yield low regression error on a supervised target. We believe that clusterwise regression is just a single vertex of a largely unexplored design space of supervised clustering models. In this article, we define a generalized optimization framework for predictive clustering that admits different cluster definitions (arbitrary point assignment, closest center, and bounding box) and both regression and classification objectives. We then present a joint optimization strategy that exploits mixed-integer linear programming (MILP) for global optimization in this generalized framework. To alleviate scalability concerns for large…
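To make the clusterwise regression idea mentioned in the abstract concrete, the sketch below alternates between fitting one least-squares model per cluster and reassigning each point to the cluster whose model predicts it best. This is a simple local-search heuristic for the "arbitrary point assignment" case, not the paper's MILP formulation or its scalable algorithms, and it offers no global optimality guarantee. The function name `clusterwise_regression` and its parameters (`k`, `n_iter`, `seed`) are assumptions made for illustration.

```python
import numpy as np


def clusterwise_regression(X, y, k=2, n_iter=50, seed=0):
    """Alternating heuristic for clusterwise linear regression (illustrative only).

    X: (n, d) feature array, y: (n,) target array, n_iter must be >= 1.
    Returns (labels, coefs, sse), where coefs has shape (k, d + 1) and the
    last column holds each cluster's intercept.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([X, np.ones((n, 1))])      # append an intercept column
    labels = rng.integers(0, k, size=n)       # random initial assignment
    coefs = np.zeros((k, Xb.shape[1]))

    for _ in range(n_iter):
        # Step 1: refit a least-squares model on each cluster's points.
        for c in range(k):
            mask = labels == c
            if mask.sum() >= Xb.shape[1]:     # skip clusters too small to fit
                coefs[c] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]

        # Step 2: move each point to the cluster with the smallest squared error.
        residuals = (Xb @ coefs.T - y[:, None]) ** 2   # shape (n, k)
        new_labels = residuals.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                              # assignments are stable
        labels = new_labels

    sse = float(residuals[np.arange(n), labels].sum())
    return labels, coefs, sse
```

Because each step can only keep or lower the total squared error, the loop converges to a local optimum that depends on the random start. That dependence on initialization is the kind of limitation a global MILP formulation is meant to avoid.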
Taxonomy
Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Remote-Sensing Image Classification
