On-the-fly Global Embeddings Using Random Projections for Extreme Multi-label Classification

Yashaswi Verma

arXiv:1912.08140·cs.LG·October 18, 2021

TL;DR

This paper introduces a simple, fast, and structure-preserving random projection-based embedding method for extreme multi-label classification that is independent of training data and label vocabulary, achieving competitive accuracy and significant speed-ups.

Contribution

It presents an on-the-fly global embedding technique based on random projections whose learning phase is independent of the training samples and the label vocabulary, and shows that an ensemble of such learners further improves prediction accuracy.

Findings

01 Achieves competitive accuracy on benchmark datasets.

02 Provides 6572x faster training compared to existing methods.

03 Reduces model size by approximately 14.7 times.

Abstract

The goal of eXtreme Multi-label Learning (XML) is to automatically annotate a given data point with the most relevant subset of labels from an extremely large vocabulary of labels (e.g., a million labels). Lately, many attempts have been made to address this problem, and they achieve reasonable performance on benchmark datasets. In this paper, rather than coming up with an altogether new method, our objective is to present and validate a simple baseline for this task. Precisely, we investigate an on-the-fly, global, and structure-preserving feature embedding technique using random projections whose learning phase is independent of training samples and label vocabulary. Further, we show how an ensemble of multiple such learners can be used to achieve a further boost in prediction accuracy with only a linear increase in training and prediction time. Experiments on three public XML benchmarks show…
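
To make the idea in the abstract concrete, here is a minimal sketch of a random-projection embedding combined with an ensemble. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian projection matrix, the k-nearest-neighbour label scoring, and every function and parameter name (`make_projection`, `knn_label_scores`, `ensemble_predict`, `n_components`, `n_learners`, `k`, `top`) are hypothetical choices made here. The only properties taken from the abstract are that the embedding is drawn without looking at training samples or labels, and that several such learners are combined at a cost that grows linearly with their number.

```python
import numpy as np


def make_projection(n_features, n_components, seed):
    """Draw a Gaussian random projection matrix.

    Only the feature dimension is needed: no training samples and no
    label vocabulary are consulted (assumed Gaussian entries).
    """
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(n_components)
    return rng.normal(0.0, scale, size=(n_features, n_components))


def knn_label_scores(Z_train, Y_train, Z_test, k):
    """Score labels by averaging the label vectors of the k nearest
    training points in the embedded space (an assumed predictor)."""
    # Squared Euclidean distances between every test and training point.
    d2 = (
        (Z_test ** 2).sum(axis=1)[:, None]
        - 2.0 * Z_test @ Z_train.T
        + (Z_train ** 2).sum(axis=1)[None, :]
    )
    nn = np.argsort(d2, axis=1)[:, :k]   # shape (n_test, k)
    return Y_train[nn].mean(axis=1)      # shape (n_test, n_labels)


def ensemble_predict(X_train, Y_train, X_test,
                     n_components=16, n_learners=3, k=5, top=3):
    """Average label scores over several independent random projections.

    Cost grows linearly with n_learners, since each learner is one
    projection plus one nearest-neighbour search.
    """
    scores = np.zeros((X_test.shape[0], Y_train.shape[1]))
    for seed in range(n_learners):
        R = make_projection(X_train.shape[1], n_components, seed)
        scores += knn_label_scores(X_train @ R, Y_train, X_test @ R, k)
    scores /= n_learners
    # Indices of the `top` highest-scoring labels per test point.
    return np.argsort(-scores, axis=1)[:, :top], scores
```

Calling `ensemble_predict(X_train, Y_train, X_test)` with a dense feature matrix and a binary label matrix returns the top-ranked label indices and the averaged score matrix. A real XML system would use sparse matrices and an approximate nearest-neighbour index; the dense brute-force search here is only for readability.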

Taxonomy

Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning