When does Diversity Help Generalization in Classification Ensembles?
Yijun Bian, Huanhuan Chen

TL;DR
This paper investigates how diversity affects the generalization of classification ensembles, revealing that increasing diversity improves generalization only within specific ranges and proposing pruning methods to optimize ensemble performance.
Contribution
It introduces a novel diversity measurement based on error decomposition and establishes a theoretical relationship with generalization error, along with practical pruning methods.
Findings
Diversity improves generalization only in certain ranges.
The proposed pruning methods effectively balance diversity and ensemble size.
Empirical results support the theoretical relationship between diversity and generalization.
Abstract
Ensembles, a widely used and effective technique in the machine learning community, owe their success to one key element: "diversity." Unfortunately, the relationship between diversity and generalization is not fully understood and remains an open research issue. To reveal the effect of diversity on the generalization of classification ensembles, we investigate three issues: the measurement of diversity, the relationship between the proposed diversity and the generalization error, and the use of this relationship for ensemble pruning. For the measurement, we quantify diversity through an error decomposition inspired by regression ensembles, which decomposes the error of a classification ensemble into accuracy and diversity. Then we formulate the relationship between the measured diversity and ensemble performance through the theorem of margin and…
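The abstract says the diversity measure is inspired by error decomposition in regression ensembles. As a rough illustration of that inspiration only, the sketch below checks the classical ambiguity decomposition for a uniformly weighted regression ensemble: squared error of the averaged prediction equals the mean member error minus the mean spread of members around the average. This is not the paper's classification decomposition; the function name `ambiguity_decomposition`, the uniform weights, and the toy votes are assumptions made for the example.

```python
def ambiguity_decomposition(predictions, target):
    """Return (ensemble_error, mean_member_error, mean_ambiguity) for one example.

    predictions: real-valued outputs of the ensemble members (uniform weights assumed).
    target: the true value for this example.
    """
    m = len(predictions)
    ensemble = sum(predictions) / m  # averaged prediction
    ensemble_error = (ensemble - target) ** 2
    # "Accuracy" term: average squared error of the individual members.
    mean_member_error = sum((p - target) ** 2 for p in predictions) / m
    # "Diversity" term: average squared spread of members around the ensemble.
    mean_ambiguity = sum((p - ensemble) ** 2 for p in predictions) / m
    return ensemble_error, mean_member_error, mean_ambiguity


# Toy example: three members output values in {-1, +1}; the true value is +1.
votes = [1.0, 1.0, -1.0]
err, acc_term, div_term = ambiguity_decomposition(votes, 1.0)
# Identity: ensemble error == mean member error - mean ambiguity.
assert abs(err - (acc_term - div_term)) < 1e-12
```

In this regression setting more spread among members always lowers the ensemble error for a fixed mean member error. The paper's stated finding is that for classification ensembles the benefit of diversity holds only within certain ranges, so the identity above should be read as background, not as the paper's result.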
Taxonomy
Methods: Pruning
