Leveraging Distributional Semantics for Multi-Label Learning
Rahul Wadbude, Vivek Gupta, Piyush Rai, Nagarajan Natarajan, Harish Karnick, Prateek Jain

TL;DR
This paper introduces a scalable label embedding framework for large-scale multi-label learning, inspired by distributional semantics and word embedding techniques, improving performance on benchmark datasets.
Contribution
It draws a novel connection between label embedding methods for multi-label learning and paragraph/document embedding methods, and extends the framework to incorporate label-label correlations, which matters most when many training labels are missing.
Findings
Outperforms several baselines on benchmark datasets.
Effectively incorporates label-label correlations.
Enables end-to-end learning with joint embedding and regression models.
Abstract
We present ExMLDS (Extreme Multi-Label Learning using Distributional Semantics), a novel and scalable label embedding framework for large-scale multi-label learning. Our approach draws inspiration from ideas rooted in distributional semantics, specifically the Skip Gram Negative Sampling (SGNS) approach, widely used to learn word embeddings for natural language processing tasks. Learning such embeddings can be reduced to a certain matrix factorization. Our approach is novel in that it highlights interesting connections between label embedding methods used for multi-label learning and paragraph/document embedding methods commonly used for learning representations of text data. The framework can also be easily extended to incorporate auxiliary information such as label-label correlations; this is crucial especially when there are many missing labels in the training data. We…
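The abstract's remark that learning SGNS embeddings "can be reduced to a certain matrix factorization" can be illustrated with a small sketch. The code below is not the authors' algorithm. It assumes the commonly cited reading of SGNS as factorizing a shifted positive PMI (SPPMI) matrix, and applies it to label co-occurrence counts from a toy label matrix. The names `sppmi_matrix`, `embed`, `Y`, and the shift parameter `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sppmi_matrix(cooc, k=1.0):
    """Shifted positive PMI: max(PMI(i, j) - log k, 0) for a count matrix."""
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        shifted = np.log(cooc * total / (row * col)) - np.log(k)
    # Zero counts give -inf or nan; treat those entries as "no association".
    shifted[~np.isfinite(shifted)] = 0.0
    return np.maximum(shifted, 0.0)


def embed(matrix, dim):
    """Rank-`dim` symmetric factorization via truncated SVD: rows of U * sqrt(S)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy label matrix (assumption): 6 instances, 4 labels, two label groups.
Y = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

cooc = Y.T @ Y                 # label-label co-occurrence counts
M = sppmi_matrix(cooc, k=1.0)  # target matrix to factorize
emb = embed(M, dim=2)          # one 2-d embedding per label

print(np.round(emb @ emb.T, 3))  # inner products approximate the SPPMI matrix
```

In this toy case, labels that always co-occur end up with identical embeddings, and labels that never co-occur end up orthogonal. On real data the factorization would be low-rank and approximate, and a sparse or randomized SVD would be the more practical choice.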
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Sentiment Analysis and Opinion Mining
