An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels
Amirhossein Akbarnejad, Mahdieh Soleymani Baghshah

TL;DR
This paper introduces a novel non-linear embedding-based multi-label classifier that effectively handles large-scale datasets, missing labels, label correlations, and unlabeled data, outperforming existing methods.
Contribution
The paper claims to present the first multi-label classifier that simultaneously addresses large-scale data, missing labels, tail label prediction, and exploitation of unlabeled data, using a non-linear embedding approach.
Findings
Outperforms state-of-the-art classifiers in prediction accuracy.
Reduces training time significantly compared to existing methods.
Effectively predicts infrequent tail labels.
Abstract
Multi-label classification has received considerable interest in recent years. Multi-label classifiers have to address many problems, including: handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To tackle datasets with a large set of labels, embedding-based methods have been proposed which seek to represent the label assignments in a low-dimensional space. Many state-of-the-art embedding-based methods use a linear dimensionality reduction to represent the label assignments in a low-dimensional space. However, by doing so, these methods actually neglect the tail labels - labels that are infrequently assigned to instances. We propose an embedding-based method that non-linearly embeds the label…
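The abstract's point about linear embeddings neglecting tail labels can be illustrated with a small sketch. This is not the paper's method: it uses a truncated SVD as a stand-in for "linear dimensionality reduction", and the toy label matrix, the 0.5 decision threshold, and the helper name `low_rank_label_reconstruction` are all assumptions made for illustration.

```python
import numpy as np


def low_rank_label_reconstruction(Y, k):
    """Project the label matrix onto its top-k singular directions and back.

    Hypothetical helper: a generic truncated SVD, used here only as an
    example of a linear label embedding (not the paper's algorithm).
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]


# Toy label matrix (assumed data): 100 instances, 3 labels.
# Labels 0 and 1 are frequent (40 instances each); label 2 is a tail
# label assigned to only 2 instances.
Y = np.zeros((100, 3))
Y[:40, 0] = 1.0
Y[40:80, 1] = 1.0
Y[80:82, 2] = 1.0

# Embed the labels in 2 dimensions and decode them again.
Y_hat = low_rank_label_reconstruction(Y, k=2)

# Per-label recall after thresholding the reconstruction at 0.5.
recall = ((Y_hat >= 0.5) & (Y == 1)).sum(axis=0) / Y.sum(axis=0)
print(recall)
```

In this deliberately clean toy case the two retained singular directions are spent on the frequent labels, so the tail label's column is reconstructed as zeros and its recall drops to zero. Real label matrices are noisier and have correlated labels, so the effect is usually less absolute, but the mechanism is the same: a low-rank linear projection keeps the directions with the most mass, and rarely assigned labels contribute little mass.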
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
