UniDEC: Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification
Siddhant Kharbanda, Devaansh Gupta, Gururaj K, Pankaj Malhotra, Amit Singh, Cho-Jui Hsieh, Rohit Babbar

TL;DR
UniDEC introduces a unified, end-to-end training framework for extreme multi-label classification that significantly reduces computational costs while achieving state-of-the-art results on large-scale datasets.
Contribution
The paper proposes UniDEC, a novel loss-independent, end-to-end trainable framework that combines dual encoder and classifier training with label subset selection to improve efficiency and performance.
Findings
Achieves state-of-the-art results on datasets with millions of labels.
Reduces training computational cost by 4-16x on large datasets.
Operates efficiently on a single GPU.
Abstract
Extreme Multi-label Classification (XMC) involves predicting a subset of relevant labels from an extremely large label space, given an input query and labels with textual features. Models developed for this problem have conventionally used a dual encoder (DE) to embed the queries and label texts, and one-vs-all (OvA) classifiers to rerank the labels shortlisted by the DE. While such methods have shown empirical success, a major drawback is their computational cost, often requiring up to 16 GPUs to train on the largest public dataset. This high cost is a consequence of calculating the loss over the entire label space. While shortlisting strategies have been proposed for classifiers, we aim to study such methods for the DE framework. In this work, we develop UniDEC, a loss-independent, end-to-end trainable framework which trains the DE and classifier together in a unified manner with…
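To make the cost argument concrete, here is a minimal sketch of the general idea of computing a loss over a shortlisted label subset instead of the entire label space. It is an illustration only and is not UniDEC's actual procedure, which the truncated abstract does not spell out. The function names (`shortlist`, `subset_bce_loss`), the dot-product scoring, and the choice of binary cross-entropy are all assumptions made for this example.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def shortlist(query_vec, label_vecs, k):
    """Hypothetical DE-style shortlisting: indices of the k labels whose
    embeddings have the highest dot product with the query embedding."""
    scores = [dot(query_vec, lv) for lv in label_vecs]
    ranked = sorted(range(len(label_vecs)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]

def subset_bce_loss(query_vec, classifier_weights, positives, candidates):
    """Mean binary cross-entropy over `candidates` only (assumed loss).
    The cost scales with len(candidates), not with the full label count."""
    total = 0.0
    for j in candidates:
        logit = dot(query_vec, classifier_weights[j])
        p = 1.0 / (1.0 + math.exp(-logit))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp for numerical safety
        y = 1.0 if j in positives else 0.0
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(candidates)

# Toy data: 4 labels in a 2-dimensional embedding space (made-up values).
query = [1.0, 0.0]
label_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
classifier_weights = [[0.0, 0.0] for _ in label_embeddings]  # untrained OvA weights
positives = {0}

# Always keep the ground-truth labels in the candidate set.
candidates = sorted(set(shortlist(query, label_embeddings, k=2)) | positives)
loss = subset_bce_loss(query, classifier_weights, positives, candidates)
```

In this toy setup the loss touches 2 of the 4 labels. With millions of labels and a shortlist of a few hundred, the same pattern is what makes per-query loss computation far cheaper than a full pass over the label space.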
Taxonomy
Topics: Text and Document Classification Technologies
