NGAME: Negative Mining-aware Mini-batching for Extreme Classification
Kunal Dahiya, Nilesh Gupta, Deepak Saini, Akshay Soni, Yajun Wang, Kushal Dave, Jian Jiao, Gururaj K, Prasenjit Dey, Amit Singh, Deepesh Hada, Vidit Jain, Bhawna Paliwal, Anshul Mittal, Sonu Mehta, Ramachandran Ramjee, Sumeet Agarwal, Purushottam Kar, Manik Varma

TL;DR
NGAME introduces a lightweight mini-batch creation method for extreme classification that enables larger batch sizes, faster training, and higher accuracy, outperforming existing negative sampling techniques on benchmarks and in real-world search engine applications.
Contribution
NGAME provides a provably accurate in-batch negative sampling technique that reduces memory overhead, allowing larger mini-batches and improved training efficiency for deep extreme classification models.
Findings
Up to 16% more accurate on benchmark datasets.
3% higher accuracy in search query retrieval.
Up to 23% gains in click-through-rates in live A/B tests.
Abstract
Extreme Classification (XC) seeks to tag data points with the most relevant subset of labels from an extremely large label set. Performing deep XC with dense, learnt representations for data points and labels has attracted much attention due to its superiority over earlier XC methods that used sparse, hand-crafted features. Negative mining techniques have emerged as a critical component of all deep XC methods that allow them to scale to millions of labels. However, despite recent advances, training deep XC models with large encoder architectures such as transformers remains challenging. This paper identifies that memory overheads of popular negative mining techniques often force mini-batch sizes to remain small and slow training down. In response, this paper introduces NGAME, a light-weight mini-batch creation technique that offers provably accurate in-batch negative samples. This…
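To make the term "in-batch negative samples" concrete, here is a minimal sketch of generic in-batch negative sampling. It is not NGAME's mini-batch creation procedure, which this summary does not describe. The function name `in_batch_loss`, the dot-product scoring, and the softmax cross-entropy loss are all assumptions chosen for illustration. Each data point in the batch brings one positive label, and the positive labels of the other data points are reused as its negatives, so no label embeddings beyond those already in the batch need to be computed.

```python
import math


def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_loss(point_vecs, label_vecs, relevant):
    """Mean softmax cross-entropy over a mini-batch using in-batch negatives.

    point_vecs[i]: embedding of the i-th data point in the batch.
    label_vecs[i]: embedding of the positive label paired with data point i.
    relevant[i]:   set of batch positions j whose label is also relevant to
                   data point i; these are masked out so that a true positive
                   is never treated as a negative.
    (Illustrative assumptions: dot-product scores and a softmax loss.)
    """
    total = 0.0
    for i, x in enumerate(point_vecs):
        pos = dot(x, label_vecs[i])
        # Negatives are the other labels in the batch that are not relevant to i.
        negs = [
            dot(x, z)
            for j, z in enumerate(label_vecs)
            if j != i and j not in relevant[i]
        ]
        scores = [pos] + negs
        # Numerically stable log-sum-exp over the positive and its negatives.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_norm - pos
    return total / len(point_vecs)


# Toy batch: two data points, each paired with one positive label.
points = [[1.0, 0.0], [0.0, 1.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]
loss = in_batch_loss(points, labels, relevant=[set(), set()])
```

In this sketch the quality of the negatives depends entirely on which data points end up in the same batch, which is why the way mini-batches are assembled matters for this family of methods.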
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Machine Learning and ELM
