Open Vocabulary Extreme Classification Using Generative Models
Daniel Simig, Fabio Petroni, Pouya Yanki, Kashyap Popat, Christina Du, Sebastian Riedel, Majid Yazdani

TL;DR
This paper introduces GROOV, a generative seq2seq model for open vocabulary extreme classification, enabling the prediction of new labels outside a predefined set, thus addressing label incompleteness in real-world scenarios.
Contribution
The paper presents GROOV, a novel fine-tuned generative model for open vocabulary XMC that can generate unseen labels and uses a new loss function independent of label order.
Findings
GROOV predicts meaningful new labels outside the known vocabulary.
It performs comparably to state-of-the-art methods on known labels.
It is effective in real-world scenarios where the label set is incomplete.
Abstract
The extreme multi-label classification (XMC) task aims at tagging content with a subset of labels from an extremely large label set. The label vocabulary is typically defined in advance by domain experts and assumed to capture all necessary tags. However, in real-world scenarios this label set, although large, is often incomplete, and experts frequently need to refine it. To develop systems that simplify this process, we introduce the task of open vocabulary XMC (OXMC): given a piece of content, predict a set of labels, some of which may be outside of the known tag set. Hence, in addition to not having training data for some labels - as is the case in zero-shot classification - models need to invent some labels on the fly. We propose GROOV, a fine-tuned seq2seq model for OXMC that generates the set of labels as a flat sequence and is trained using a novel loss independent of predicted…
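To make the "set of labels as a flat sequence" idea concrete, here is a minimal sketch of how a label set might be serialized into a single target string, parsed back from generated text, and split into known and novel labels. This is an illustration only, not the paper's implementation: the separator `SEP`, the lowercasing step, and the helper names `labels_to_sequence`, `sequence_to_labels`, and `split_known_novel` are all assumptions introduced here.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical separator; the delimiter actually used by GROOV is not stated here.
SEP = " ; "


def labels_to_sequence(labels: Iterable[str]) -> str:
    """Serialize a label set into one flat target string for a seq2seq model."""
    return SEP.join(labels)


def sequence_to_labels(generated: str) -> List[str]:
    """Parse generated text back into labels.

    Labels are stripped, lowercased (an assumed normalization), and
    de-duplicated while keeping first-seen order.
    """
    seen: Set[str] = set()
    labels: List[str] = []
    for raw in generated.split(SEP.strip()):
        label = raw.strip().lower()
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    return labels


def split_known_novel(
    predicted: List[str], vocabulary: Set[str]
) -> Tuple[List[str], List[str]]:
    """Separate predictions found in the known tag set from newly invented ones."""
    known = [label for label in predicted if label in vocabulary]
    novel = [label for label in predicted if label not in vocabulary]
    return known, novel


if __name__ == "__main__":
    vocabulary = {"machine learning", "nlp"}
    predicted = sequence_to_labels("NLP ; extreme classification ; nlp ;")
    known, novel = split_known_novel(predicted, vocabulary)
    print("known:", known)
    print("novel:", novel)
```

In an open vocabulary setting, the `novel` list is the part a closed-set classifier could never produce; a domain expert could review it when refining the tag set.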
Taxonomy
Topics: Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning · Machine Learning in Bioinformatics
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
