Fitting Multiple Machine Learning Models with Performance Based Clustering

Mehmet Efe Lorasdagi, Ahmet Berker Koc, Ali Taha Koc, and Suleyman Serdar Kozat

arXiv:2411.06572·cs.LG·January 31, 2025

TL;DR

This paper proposes a clustering-based framework that groups data by feature-target relations to fit multiple models, improving performance on complex, real-world datasets, including streaming data scenarios.

Contribution

The paper introduces a clustering method that relaxes the single-mechanism assumption, so that multiple models can each capture a different part of a heterogeneous dataset, including in streaming settings.
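As a rough illustration of grouping data by feature-target relations, the sketch below alternates between fitting one model per group and reassigning each sample to the model that predicts it best. This is a minimal sketch under stated assumptions, not the authors' algorithm: the linear models, the squared-error criterion, the random initialization, and every function name (`fit_linear`, `predict_linear`, `performance_based_clustering`) are hypothetical choices made for the example.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the weight vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w


def predict_linear(w, X):
    """Predictions of a linear model fitted by fit_linear."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def performance_based_clustering(X, y, n_clusters=2, n_iters=20, seed=0):
    """Alternate between per-cluster fits and loss-based reassignment.

    Assumed procedure (k-means style): each sample joins the cluster
    whose model has the lowest squared error on that sample.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, n_clusters, size=n)  # random initial grouping
    models = []
    for _ in range(n_iters):
        models = []
        for k in range(n_clusters):
            idx = np.flatnonzero(labels == k)
            if idx.size < d + 1:
                # Nearly empty cluster: refit on a small random subset instead.
                idx = rng.choice(n, size=min(n, d + 1), replace=False)
            models.append(fit_linear(X[idx], y[idx]))
        # Squared error of every model on every sample, shape (n, n_clusters).
        errors = np.stack(
            [(y - predict_linear(w, X)) ** 2 for w in models], axis=1
        )
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return models, labels


# Synthetic data drawn from two mechanisms: y = 3x and y = -3x, plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=400)
mechanism = rng.integers(0, 2, size=400)
y = np.where(mechanism == 0, 3.0 * x, -3.0 * x) + 0.05 * rng.normal(size=400)
X = x[:, None]

# In-sample error of one global model versus one model per cluster.
single_mse = np.mean((y - predict_linear(fit_linear(X, y), X)) ** 2)
models, labels = performance_based_clustering(X, y, n_clusters=2)
preds = np.stack([predict_linear(w, X) for w in models], axis=1)
multi_mse = np.mean((y - preds[np.arange(len(y)), labels]) ** 2)
print(f"single model MSE: {single_mse:.4f}, clustered MSE: {multi_mse:.4f}")
```

Note that the assignment step uses the target values, so the comparison above is in-sample only. Routing unseen samples to a model needs a separate mechanism, which this sketch does not cover.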

Findings

01. Significant performance improvements over traditional single-model methods.

02. Effective handling of streaming data with adaptive ensemble weights.

03. Validated on real-world datasets with diverse data distributions.

Abstract

Traditional machine learning approaches assume that the data come from a single generating mechanism, an assumption that may not hold for most real-life data and that can lead to suboptimal performance when it fails. We introduce a clustering framework that removes this assumption by grouping the data according to the relations between the features and the target values, and we fit a separate model to each group so that different models learn different parts of the data. We further extend the framework to streaming data, where outcomes are produced by an ensemble of models whose weights are updated as each new batch of data arrives. We demonstrate the approach on widely studied real-life datasets, showing significant improvements over traditional single-model approaches.
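The streaming extension can be pictured with a small sketch in which ensemble weights shift toward whichever models perform best on each incoming batch. The abstract does not state the update rule, so the exponentiated (multiplicative-weights) update, the learning rate `eta`, the two fixed toy models, and the function names `update_weights` and `ensemble_predict` are all assumptions made for illustration.

```python
import numpy as np


def update_weights(weights, batch_losses, eta=1.0):
    """Assumed rule: scale each weight by exp(-eta * loss), then renormalize."""
    scaled = weights * np.exp(-eta * np.asarray(batch_losses, dtype=float))
    return scaled / scaled.sum()


def ensemble_predict(weights, model_preds):
    """Weighted average of model predictions; model_preds is (n_models, n_samples)."""
    return weights @ model_preds


# Two hypothetical, already-fitted models standing in for per-cluster models.
models = [lambda x: 2.0 * x, lambda x: -2.0 * x]
weights = np.full(len(models), 1.0 / len(models))  # start from uniform weights

rng = np.random.default_rng(0)
for _ in range(5):
    # Each incoming batch follows the first mechanism, y = 2x plus noise.
    x = rng.uniform(-1.0, 1.0, size=64)
    y = 2.0 * x + 0.1 * rng.normal(size=64)

    model_preds = np.stack([m(x) for m in models])
    # Predict with the current weights before the targets are used.
    y_hat = ensemble_predict(weights, model_preds)

    # Then score each model on the batch and update the weights.
    batch_losses = np.mean((model_preds - y) ** 2, axis=1)
    weights = update_weights(weights, batch_losses)

print("final ensemble weights:", weights)
```

Predicting before updating mirrors the streaming setting, where targets for a batch become available only after the outcome has been produced. A smaller `eta` would make the weights adapt more slowly and be less sensitive to a single noisy batch.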

Code & Models

Repositories

mefe06/function-clustering (Official)

Taxonomy

Topics: Advanced Clustering Algorithms Research · Data Mining Algorithms and Applications