Comparing published multi-label classifier performance measures to the ones obtained by a simple multi-label baseline classifier
Jean Metz, Newton Spolaôr, Everton A. Cherman, Maria C. Monard

TL;DR
This paper demonstrates that many published multi-label classifiers perform worse than a simple baseline, highlighting the need for the community to compare against such baselines and provide explanations for poor results.
Contribution
It introduces General_B, a simple multi-label baseline classifier, and shows many published results are comparable or worse, urging better benchmarking and explanation practices.
Findings
Many published results are worse than or equal to General_B
Up to 43% of results on one dataset are comparable to the baseline
Most studies lack explanations for poor classifier performance
Abstract
In supervised learning, simple baseline classifiers can be constructed by only looking at the class, i.e., ignoring any other information from the dataset. The single-label learning community frequently uses as a reference the one which always predicts the majority class. Although a classifier might perform worse than this simple baseline classifier, this behaviour requires a special explanation. To motivate the community to compare experimental results with those of a multi-label baseline classifier, and to draw attention to the need for special explanations when classifiers perform worse than the baseline, in this work we propose the use of General_B, a multi-label baseline classifier. General_B was evaluated in contrast to results published in the literature which were carefully selected using a systematic review process. It was found that a…
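The single-label majority-class baseline mentioned in the abstract can be illustrated with a minimal Python sketch. This is not General_B, whose construction is not described in the excerpt above; the function names and toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most frequent class label; features are never consulted."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return Counter(labels).most_common(1)[0][0]


def predict_majority_baseline(majority_label, n_instances):
    """Predict the same (majority) label for every instance."""
    return [majority_label] * n_instances


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (hypothetical data, not taken from the paper).
train_labels = ["spam", "ham", "ham", "ham", "spam"]
test_labels = ["ham", "spam", "ham", "ham"]

majority = fit_majority_baseline(train_labels)
preds = predict_majority_baseline(majority, len(test_labels))
baseline_accuracy = accuracy(test_labels, preds)
```

A learned classifier that scores below `baseline_accuracy` on the same test split is the situation the abstract says requires a special explanation.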
Taxonomy
Topics: Text and Document Classification Technologies · Algorithms and Data Compression · Spam and Phishing Detection
