Leveraging Distributional Semantics for Multi-Label Learning

Rahul Wadbude; Vivek Gupta; Piyush Rai; Nagarajan Natarajan; Harish Karnick; Prateek Jain

arXiv:1709.05976 · cs.LG · November 13, 2017

TL;DR

This paper introduces a scalable label embedding framework for large-scale multi-label learning, inspired by distributional semantics and word embedding techniques, improving performance on benchmark datasets.

Contribution

The paper draws a novel connection between label embedding methods for multi-label learning and paragraph/document embedding methods, and extends the framework to incorporate label-label correlations, which helps when training labels are missing.

Findings

01. Outperforms several baselines on benchmark datasets.

02. Effectively incorporates label-label correlations.

03. Enables end-to-end learning with joint embedding and regression models.

Abstract

We present ExMLDS (Extreme Multi-Label Learning using Distributional Semantics), a novel and scalable label embedding framework for large-scale multi-label learning. Our approach draws inspiration from ideas rooted in distributional semantics, specifically the Skip Gram Negative Sampling (SGNS) approach, which is widely used to learn word embeddings for natural language processing tasks. Learning such embeddings can be reduced to a certain matrix factorization. Our approach is novel in that it highlights interesting connections between label embedding methods used for multi-label learning and paragraph/document embedding methods commonly used for learning representations of text data. The framework can also be easily extended to incorporate auxiliary information such as label-label correlations; this is especially crucial when many labels are missing from the training data. We…
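
The abstract's remark that SGNS-style embeddings reduce to a matrix factorization can be illustrated with a small sketch. It relies on the well-known result that SGNS implicitly factorizes a shifted PMI matrix, and it is not the paper's exact algorithm. The assumptions here are ours: instances play the role of "words", the co-occurrence count of two instances is the number of labels they share, and the function names `sppmi` and `embed`, the toy label matrix `Y`, and the parameters `k` and `dim` are all hypothetical.

```python
import numpy as np


def sppmi(cooc, k=1.0):
    """Shifted positive PMI: max(PMI(i, j) - log(k), 0).

    `k` stands in for the number of negative samples in SGNS
    (assumption: illustrative value, not taken from the paper).
    """
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        shifted = np.log(cooc * total / (row * col)) - np.log(k)
    # Zero counts give -inf (or nan for empty rows); treat them as 0.
    shifted[~np.isfinite(shifted)] = 0.0
    return np.maximum(shifted, 0.0)


def embed(matrix, dim):
    """Rank-`dim` factorization via SVD: rows of U_d * sqrt(S_d)."""
    u, s, _ = np.linalg.svd(matrix)
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy binary label matrix (assumption): 5 instances x 4 labels.
Y = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
])

# Instance-instance co-occurrence: number of shared labels.
cooc = Y @ Y.T
M = sppmi(cooc, k=1.0)
E = embed(M, dim=2)

# Instances with identical label sets should land close together.
d_same = np.linalg.norm(E[0] - E[1])
d_diff = np.linalg.norm(E[0] - E[2])
print(d_same < d_diff)
```

In this toy setup, instances 0 and 1 share both labels and receive nearly identical embeddings, while instance 2 shares none with them and lands elsewhere. A regressor from input features to these embeddings would be the natural next step in a label-embedding pipeline, but that part is omitted here.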

Taxonomy

Topics: Text and Document Classification Technologies · Topic Modeling · Sentiment Analysis and Opinion Mining