On-the-fly Global Embeddings Using Random Projections for Extreme Multi-label Classification
Yashaswi Verma

TL;DR
This paper presents a simple, fast, structure-preserving embedding method based on random projections for extreme multi-label classification. Its learning phase is independent of the training samples and label vocabulary, and it achieves competitive accuracy with large reductions in training time and model size.
Contribution
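It presents and validates a simple baseline rather than an altogether new method: an on-the-fly global embedding technique using random projections whose learning phase is independent of the training samples and label vocabulary. It also shows that an ensemble of such learners further improves prediction accuracy with only a linear increase in training and prediction time.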
Findings
Achieves competitive accuracy on three public XML benchmark datasets.
Provides 6572x faster training compared to existing methods.
Reduces model size by approximately 14.7 times.
Abstract
The goal of eXtreme Multi-label Learning (XML) is to automatically annotate a given data point with the most relevant subset of labels from an extremely large vocabulary of labels (e.g., a million labels). Lately, many attempts have been made to address this problem, and these achieve reasonable performance on benchmark datasets. In this paper, rather than coming up with an altogether new method, our objective is to present and validate a simple baseline for this task. Precisely, we investigate an on-the-fly, global, and structure-preserving feature embedding technique using random projections whose learning phase is independent of training samples and label vocabulary. Further, we show how an ensemble of multiple such learners can be used to achieve a further boost in prediction accuracy with only a linear increase in training and prediction time. Experiments on three public XML benchmarks show…
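The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Gaussian projection matrix, the nearest-neighbour label voting used as the predictor, and every name and default below (`make_projection`, `knn_label_scores`, `ensemble_scores`, `n_components`, `k`, `n_learners`) are assumptions made for illustration. What it does show is that the projection is drawn without looking at the training samples or labels, and that an ensemble costs one extra projection and one extra neighbour search per learner.

```python
import numpy as np


def make_projection(n_features, n_components, seed):
    """Draw a Gaussian random projection matrix.

    Only the feature dimension is needed; no training samples or labels are
    used (hypothetical helper, not from the paper).
    """
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(n_components) keeps squared distances unbiased in expectation.
    return rng.standard_normal((n_features, n_components)) / np.sqrt(n_components)


def knn_label_scores(Z_train, Y_train, Z_test, k):
    """Score each label by averaging the label vectors of the k nearest
    training points in the embedded space (assumed predictor)."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = (
        (Z_test ** 2).sum(axis=1)[:, None]
        - 2.0 * Z_test @ Z_train.T
        + (Z_train ** 2).sum(axis=1)[None, :]
    )
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k nearest neighbours
    return Y_train[nearest].mean(axis=1)      # shape (n_test, n_labels)


def ensemble_scores(X_train, Y_train, X_test, n_components=16, k=3, n_learners=5):
    """Average label scores over several independent random projections.

    Cost grows linearly with n_learners: each learner adds one projection
    and one neighbour search.
    """
    scores = np.zeros((X_test.shape[0], Y_train.shape[1]))
    for seed in range(n_learners):
        R = make_projection(X_train.shape[1], n_components, seed)
        scores += knn_label_scores(X_train @ R, Y_train, X_test @ R, k)
    return scores / n_learners
```

Given a dense feature matrix `X_train`, a binary label matrix `Y_train`, and test features `X_test`, the top five labels per test point could be read off with `np.argsort(-ensemble_scores(X_train, Y_train, X_test), axis=1)[:, :5]`. Real XML datasets are sparse and far larger, so a practical version would need sparse matrices and an approximate nearest-neighbour index.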
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
