A Scalable Curiosity-Driven Game-Theoretic Framework for Long-Tail Multi-Label Learning in Data Mining
Jing Yang, Keze Wang

TL;DR
This paper introduces a scalable game-theoretic framework that leverages curiosity-driven rewards to improve long-tail multi-label classification, especially in large label spaces, outperforming existing methods.
Contribution
It proposes a novel cooperative multi-player game framework that adaptively enhances learning on tail labels without manual tuning, integrating curiosity mechanisms with theoretical convergence guarantees.
Findings
Outperforms state-of-the-art methods on 7 benchmarks, including datasets with over 30,000 labels.
Achieves up to +1.6% P@3 improvement on Wiki10-31K.
Ablation studies confirm the effectiveness of both game cooperation and curiosity-driven exploration.
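The P@3 figure above is precision at k = 3: for each sample, the fraction of the three highest-scored labels that are actually relevant, averaged over samples. The sketch below shows the standard metric only. The function name `precision_at_k` and the toy arrays are illustrative assumptions, not taken from the paper's evaluation code.

```python
import numpy as np

def precision_at_k(scores, labels, k=3):
    """Mean fraction of the k top-scored labels that are relevant (requires k <= n_labels)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k highest scores per row; order inside the top-k does not matter for P@k.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # Look up the ground-truth 0/1 labels at those positions.
    hits = np.take_along_axis(labels, top_k, axis=1)
    return float(hits.sum(axis=1).mean() / k)

# Toy example (made-up data): 2 samples, 4 labels.
scores = [[0.9, 0.1, 0.8, 0.3],
          [0.2, 0.7, 0.1, 0.6]]
labels = [[1, 0, 0, 1],
          [0, 1, 0, 0]]
p_at_3 = precision_at_k(scores, labels, k=3)  # 2 of 3 hits, then 1 of 3 hits -> 0.5
```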
Abstract
The long-tail distribution, where a few head labels dominate while rare tail labels abound, poses a persistent challenge for large-scale Multi-Label Classification (MLC) in real-world data mining applications. Existing resampling and reweighting strategies often disrupt inter-label dependencies or require brittle hyperparameter tuning, especially as the label space expands to tens of thousands of labels. To address this issue, we propose Curiosity-Driven Game-Theoretic Multi-Label Learning (CD-GTMLL), a scalable cooperative framework that recasts long-tail MLC as a multi-player game: each sub-predictor ("player") specializes in a partition of the label space, and the players collaborate to maximize global accuracy while pursuing intrinsic curiosity rewards based on tail label rarity and inter-player disagreement. This mechanism adaptively injects learning signals into under-represented tail labels…
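The abstract excerpt does not give the reward formula, so the following is only a minimal sketch of how a curiosity signal could combine label rarity with inter-player disagreement. Everything in it is an assumption made for illustration: the function name `curiosity_bonus`, the rarity weight `1 / log(2 + count)`, the use of variance across players as the disagreement measure, and the premise that several players score the same labels.

```python
import numpy as np

def curiosity_bonus(player_probs, label_counts):
    """Per-label bonus = rarity weight * disagreement (hypothetical form, not the paper's)."""
    probs = np.asarray(player_probs, dtype=float)    # shape: (n_players, n_labels)
    counts = np.asarray(label_counts, dtype=float)   # shape: (n_labels,), training positives
    # Assumed rarity weight: decreases as a label's training frequency grows.
    rarity = 1.0 / np.log(2.0 + counts)
    # Assumed disagreement measure: variance of the players' predictions for each label.
    disagreement = probs.var(axis=0)
    return rarity * disagreement

# Two hypothetical players that disagree equally on a head label (1000 positives)
# and a tail label (5 positives); the tail label receives the larger bonus.
bonus = curiosity_bonus([[0.9, 0.9], [0.1, 0.1]], [1000, 5])
```

Under these assumptions the bonus vanishes when all players agree and grows for rare labels on which they diverge, which is one way to read "adaptively injects learning signals into under-represented tail labels" without a hand-tuned reweighting schedule.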
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
