A clusterwise supervised learning procedure based on aggregation of distances
Aurélie Fisher (LPSM UMR 8001), Mathilde Mougeot (CMLA, ENSIIE, LPSM, UMR 8001), Sothea Has (LPSM UMR 8001)

TL;DR
This paper introduces a three-step clusterwise supervised learning method that automatically detects data clusters, fits models within each, and aggregates them, improving predictive accuracy on complex data with multiple underlying structures.
Contribution
The proposed KFC procedure combines clustering with model aggregation to handle data that contain several unknown clusters, each possibly governed by a different predictive model.
Findings
Excellent performance on simulated data
Effective on real-world prediction problems
Robust across various data distributions
Abstract
Nowadays, many machine learning procedures are available off the shelf and can easily be used to calibrate predictive models on supervised data. However, when the input data consist of more than one unknown cluster, and when different underlying predictive models exist, fitting a model is a more challenging task. In this paper, we propose a three-step procedure to solve this problem automatically. The KFC procedure aggregates different models adaptively on the data. The first step of the procedure aims at capturing the clustering structure of the input data, which may be characterized by several statistical distributions. It provides several partitions, one for each assumption on the distributions. For each partition, the second step fits a specific predictive model on the data in each cluster. The overall model is computed by a consensual aggregation of the models corresponding to…
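The three steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the visible abstract does not name the distances, the local model family, or the aggregation rule, so the two divergences (squared Euclidean and generalized Kullback-Leibler), the per-cluster least-squares models, the farthest-first initialization, and the plain average used in place of the consensual aggregation are all choices made here for illustration. All function names (`kmeans`, `fit_local_models`, `kfc_sketch`, and so on) are hypothetical.

```python
import numpy as np

def sq_euclidean(X, c):
    # Squared Euclidean distance from each row of X to centroid c.
    return ((X - c) ** 2).sum(axis=1)

def gen_kl(X, c):
    # Generalized Kullback-Leibler divergence; needs strictly positive data.
    return (X * np.log(X / c) - X + c).sum(axis=1)

def farthest_first(X, k, divergence):
    # Deterministic initialization (an assumption of this sketch).
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min([divergence(X, c) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    return np.array(centroids, dtype=float)

def assign(X, centroids, divergence):
    # Label each point with the index of its closest centroid.
    dists = np.column_stack([divergence(X, c) for c in centroids])
    return dists.argmin(axis=1)

def kmeans(X, k, divergence, n_iter=20):
    # Step 1: Lloyd-style iterations with a pluggable divergence.
    # Centroids are cluster means, the minimizer for Bregman divergences.
    centroids = farthest_first(X, k, divergence)
    for _ in range(n_iter):
        labels = assign(X, centroids, divergence)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def fit_local_models(X, y, labels, k):
    # Step 2: one least-squares linear model (with intercept) per cluster.
    A = np.column_stack([np.ones(len(X)), X])
    coefs = []
    for j in range(k):
        mask = labels == j
        if not mask.any():
            mask = np.ones(len(X), dtype=bool)  # empty cluster: use all data
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        coefs.append(coef)
    return np.array(coefs)

def predict_local(X, labels, coefs):
    # Each point is predicted by the model of the cluster it falls in.
    A = np.column_stack([np.ones(len(X)), X])
    return (A * coefs[labels]).sum(axis=1)

def kfc_sketch(X_train, y_train, X_test, k=2,
               divergences=(sq_euclidean, gen_kl)):
    preds = []
    for div in divergences:  # one partition per divergence
        centroids = kmeans(X_train, k, div)
        train_labels = assign(X_train, centroids, div)
        coefs = fit_local_models(X_train, y_train, train_labels, k)
        test_labels = assign(X_test, centroids, div)
        preds.append(predict_local(X_test, test_labels, coefs))
    # Step 3 stand-in: a plain average, NOT the consensual aggregation
    # of the paper, whose details are not given in the text above.
    return np.mean(preds, axis=0)

# Toy data: two well-separated positive clusters, each with its own
# noise-free linear response.
rng = np.random.default_rng(0)

def make_data(n):
    XA = rng.uniform(0.5, 1.5, size=(n, 2))
    XB = rng.uniform(9.5, 10.5, size=(n, 2))
    yA = 1.0 + 2.0 * XA[:, 0] - XA[:, 1]
    yB = 5.0 - 3.0 * XB[:, 0] + 0.5 * XB[:, 1]
    return np.vstack([XA, XB]), np.concatenate([yA, yB])

X_train, y_train = make_data(40)
X_test, y_test = make_data(10)
pred = kfc_sketch(X_train, y_train, X_test)
```

On this toy problem both partitions recover the two groups, so each local model sees data from a single linear regime. A single global linear fit would have to compromise between the two regimes, which is the situation the abstract describes as challenging.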
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Stream Mining Techniques · Bayesian Methods and Mixture Models
