Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox
Haohui Wang, Weijie Guan, Jianpeng Chen, Zi Wang, Dawei Zhou

TL;DR
This paper introduces HeroLT, a comprehensive benchmark for long-tailed learning across multiple domains, providing new evaluation metrics, datasets, and insights to advance research in handling imbalanced data distributions.
Contribution
The paper presents HeroLT, a benchmark that integrates 18 algorithms, 10 evaluation metrics, and 17 real-world datasets across 6 tasks and 4 data modalities, and uses 315 experiments to systematically evaluate long-tailed learning methods.
Findings
HeroLT enables fair comparison of algorithms across diverse datasets.
Extensive experiments reveal strengths and weaknesses of current methods.
Open-source resources facilitate future research and reproducibility.
Abstract
Long-tailed data distributions pose challenges for a variety of domains like e-commerce, finance, biomedical science, and cyber security, where the performance of machine learning models is often dominated by head categories while tail categories are inadequately learned. This work aims to provide a systematic view of long-tailed learning with regard to three pivotal angles: (A1) the characterization of data long-tailedness, (A2) the data complexity of various domains, and (A3) the heterogeneity of emerging tasks. We develop HeroLT, a comprehensive long-tailed learning benchmark integrating 18 state-of-the-art algorithms, 10 evaluation metrics, and 17 real-world datasets across 6 tasks and 4 data modalities. HeroLT with novel angles and extensive experiments (315 in total) enables effective and fair evaluation of newly proposed methods compared with existing baselines on varying dataset…
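To make angle (A1) concrete, here is a minimal sketch of two widely used ways to quantify how long-tailed a label distribution is: the imbalance ratio (largest class size divided by smallest) and the Gini coefficient over class sizes. This is an illustration only. The function names are hypothetical, and the truncated abstract does not say which long-tailedness measures HeroLT itself adopts.

```python
from collections import Counter


def class_counts(labels):
    """Return class sizes sorted from head (largest) to tail (smallest)."""
    return sorted(Counter(labels).values(), reverse=True)


def imbalance_ratio(counts):
    """Size of the largest class divided by the size of the smallest."""
    return max(counts) / min(counts)


def gini(counts):
    """Gini coefficient of class sizes: 0 means perfectly balanced,
    values near 1 mean a few classes hold almost all samples."""
    xs = sorted(counts)  # ascending order is required by this formula
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Toy label set (assumed for illustration): one head class, two tail classes.
labels = ["a"] * 90 + ["b"] * 9 + ["c"] * 1
counts = class_counts(labels)
print(counts, imbalance_ratio(counts), round(gini(counts), 4))
```

The two measures answer different questions: the imbalance ratio looks only at the extremes, while the Gini coefficient reflects the whole distribution of class sizes, so they can rank datasets differently.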
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare · Imbalanced Data Classification Techniques
