Pitfalls of Assessing Extracted Hierarchies for Multi-Class Classification
Pablo del Moral, Slawomir Nowaczyk, Anita Sant'Anna, Sepideh Pashami

TL;DR
This paper critically examines methods for extracting class hierarchies in multi-class classification. It highlights common pitfalls, proposes random hierarchies as a benchmark, and shows that hierarchy quality affects performance mainly on complex datasets with many classes.
Contribution
The paper identifies common pitfalls in hierarchy extraction methods and demonstrates the use of random hierarchies as a baseline for evaluating how effective those methods are.
Findings
Hierarchy quality can be irrelevant with powerful classifiers.
Random hierarchies serve as effective benchmarks.
Proper hierarchy selection significantly improves performance in complex datasets.
Abstract
Using hierarchies of classes is one of the standard methods to solve multi-class classification problems. In the literature, selecting the right hierarchy is considered to play a key role in improving classification performance. Although different methods have been proposed, there is still a lack of understanding of what makes one method to extract hierarchies perform better or worse. To this effect, we analyze and compare some of the most popular approaches to extracting hierarchies. We identify some common pitfalls that may lead practitioners to make misleading conclusions about their methods. In addition, to address some of these problems, we demonstrate that using random hierarchies is an appropriate benchmark to assess how the hierarchy's quality affects the classification performance. In particular, we show how the hierarchy's quality can become irrelevant depending on the…
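The random-hierarchy benchmark described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: hierarchies are assumed to be binary trees stored as nested tuples, class labels are assumed not to be tuples themselves, and the names `random_hierarchy`, `leaves`, `node_splits`, `fraction_beaten`, and the `evaluate` callback are all hypothetical. The recursive split used here is also an assumption and does not sample uniformly over all possible trees.

```python
import random


def random_hierarchy(classes, rng):
    """Return a random binary hierarchy as nested tuples; leaves are class labels."""
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    # Pick a cut point so that both children receive at least one class.
    cut = rng.randint(1, len(classes) - 1)
    return (random_hierarchy(classes[:cut], rng),
            random_hierarchy(classes[cut:], rng))


def leaves(node):
    """Collect the class labels under a node."""
    if not isinstance(node, tuple):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def node_splits(node):
    """List the binary problems (left classes vs. right classes), one per internal node."""
    if not isinstance(node, tuple):
        return []
    left, right = node
    here = (sorted(leaves(left)), sorted(leaves(right)))
    return [here] + node_splits(left) + node_splits(right)


def fraction_beaten(candidate_score, evaluate, classes, n_random=100, seed=0):
    """Fraction of random hierarchies that score strictly below the candidate.

    `evaluate` is a hypothetical user-supplied function that trains one
    classifier per internal node of a hierarchy and returns a score
    (for example, test accuracy).
    """
    rng = random.Random(seed)
    scores = [evaluate(random_hierarchy(classes, rng)) for _ in range(n_random)]
    return sum(candidate_score > s for s in scores) / n_random
```

Under these assumptions, an extracted hierarchy whose score exceeds only about half of the random ones gives little evidence that the extraction method helps, whereas one that beats nearly all of them suggests that hierarchy quality matters for that dataset and classifier.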
Taxonomy
Topics: Imbalanced Data Classification Techniques · Text and Document Classification Technologies · Anomaly Detection Techniques and Applications
