Leveraging Linear Independence of Component Classifiers: Optimizing Size and Prediction Accuracy for Online Ensembles
Enes Bektas, Fazli Can

TL;DR
This paper introduces a geometric, linear independence-based framework to analyze and optimize ensemble classifier size, balancing accuracy gains with diminishing returns, supported by theoretical proofs and empirical validation.
Contribution
It presents a novel theoretical approach linking ensemble size to linear independence of classifiers, providing a method to determine optimal ensemble size for accuracy.
Findings
Increasing the number of classifiers generally improves accuracy
A point of diminishing returns exists for ensemble size
Ideal size estimates may vary due to dataset factors
Abstract
Ensembles, which employ a set of classifiers to collectively enhance classification accuracy, are crucial in the era of big data. However, although it is generally agreed that ensemble size and prediction accuracy are related, the exact nature of this relationship is still unknown. We introduce a novel perspective, rooted in the linear independence of classifiers' votes, to analyze the interplay between ensemble size and prediction accuracy. This framework reveals a theoretical link and, consequently, proposes an ensemble size based on this relationship. Our study builds upon a geometric framework and develops a series of theorems. These theorems clarify the role of linear dependency in crafting ensembles. We present a method to determine the minimum ensemble size required to ensure a target probability of linearly independent votes among component classifiers.…
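The idea of choosing an ensemble size that reaches a target probability of linearly independent votes can be illustrated with a toy simulation. The sketch below is not the paper's method: it assumes, purely for illustration, that each classifier casts a one-hot vote for a uniformly random class, and it asks how many classifiers are needed before the vote vectors span the full class space with a given probability. All function names (`rank`, `prob_full_rank`, `min_ensemble_size`) are hypothetical.

```python
import random
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of equal-length rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def one_hot(k, n_classes):
    """Vote vector with a 1 at class k (toy assumption: hard, one-hot votes)."""
    return [1 if j == k else 0 for j in range(n_classes)]


def prob_full_rank(n_classifiers, n_classes, trials=2000, seed=0):
    """Monte Carlo estimate of P(vote matrix has rank n_classes).

    Assumption: every classifier votes for a uniformly random class,
    independently of the others. Real classifiers are not like this.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        votes = [one_hot(rng.randrange(n_classes), n_classes)
                 for _ in range(n_classifiers)]
        if rank(votes) == n_classes:
            hits += 1
    return hits / trials


def min_ensemble_size(n_classes, target, max_size=200, **kwargs):
    """Smallest ensemble size whose estimated probability reaches the target."""
    for n in range(n_classes, max_size + 1):
        if prob_full_rank(n, n_classes, **kwargs) >= target:
            return n
    return None  # target not reached within max_size
```

Under this toy model the probability rises quickly at first and then flattens, which mirrors the diminishing returns described in the findings. A call such as `min_ensemble_size(3, 0.95)` would return the estimated size for three classes; the paper's own theorems should be consulted for the actual bound.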
Taxonomy
Topics: Data Stream Mining Techniques · Complex Network Analysis Techniques · Mobile Crowdsensing and Crowdsourcing
